VoxelFSD: voxel-based fully sparse detector with sparse convolution for 3D object detection

Volume 46, Issue 5, 2025, pp. 242-250

Affiliation: 1. School of Automation, Southeast University, Nanjing 210096, China; 2. Key Laboratory of Measurement and Control of Complex Systems of Engineering, Ministry of Education, Southeast University, Nanjing 210096, China
CLC Number: TP391.4; TH865

Abstract:

Voxel-based 3D object detection methods often suffer from poor real-time performance on large-scale LiDAR point clouds because they depend heavily on dense 2D backbone networks. This paper proposes VoxelFSD, a voxel-based fully sparse 3D object detector that significantly improves the real-time capability of long-range detection. The model has three core components. First, parallel convolutional branches (PCB) expand the receptive field and extract object features comprehensively, mitigating the impact of missing object-center features. Second, a sparse region proposal network (SRPN) head predicts objects sparsely, which avoids the redundant computation of dense prediction and improves efficiency on large-scale point clouds. Third, an ROI head with an attention fusion module (AFM-ROI) uses cross-attention in the second stage to fuse 3D backbone features with compressed bird's-eye-view (BEV) features, refining the object representation and improving detection accuracy. Removing the dense 2D backbone of traditional voxel-based detectors and integrating PCB and SRPN yields VoxelFSD-S, a fully sparse, single-stage, lightweight detector that achieves a better balance between speed and accuracy than existing lightweight voxel-based models. Building on VoxelFSD-S, VoxelFSD-T is a two-stage detector that adds AFM-ROI, improving accuracy at minimal additional computational cost. On the KITTI test set, VoxelFSD-S and VoxelFSD-T achieve accuracies of 77.67% and 81.50%, respectively.
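
The efficiency argument for sparse prediction can be illustrated with a toy example. The sketch below is not the SRPN head described in the abstract; it is a minimal illustration, assuming a hypothetical linear scoring head (`linear_head`) and a randomly populated 64 x 64 BEV grid with 100 occupied cells. All names, sizes, and weights are illustrative assumptions. It compares how many head evaluations a dense prediction needs against a sparse one that visits only occupied cells.

```python
import random


def linear_head(feature, weights, bias):
    """Hypothetical stand-in for a detection head: dot product plus bias."""
    return sum(f * w for f, w in zip(feature, weights)) + bias


def dense_predict(grid_size, occupied, weights, bias, channels):
    """Evaluate the head at every BEV cell; empty cells use a zero feature."""
    zero = [0.0] * channels
    scores = {}
    for x in range(grid_size):
        for y in range(grid_size):
            feature = occupied.get((x, y), zero)
            scores[(x, y)] = linear_head(feature, weights, bias)
    return scores


def sparse_predict(occupied, weights, bias):
    """Evaluate the head only at cells that actually contain features."""
    return {cell: linear_head(feat, weights, bias) for cell, feat in occupied.items()}


# Assumed toy sizes: a 64 x 64 BEV grid with 100 occupied cells, 4 channels.
random.seed(0)
GRID, CHANNELS, N_OCCUPIED = 64, 4, 100
all_cells = [(x, y) for x in range(GRID) for y in range(GRID)]
cells = random.sample(all_cells, N_OCCUPIED)
occupied = {c: [random.random() for _ in range(CHANNELS)] for c in cells}
weights = [0.5, -0.25, 1.0, 0.1]  # arbitrary illustrative weights
bias = 0.0

dense = dense_predict(GRID, occupied, weights, bias, CHANNELS)
sparse = sparse_predict(occupied, weights, bias)

# Number of head evaluations: dense visits every cell, sparse only occupied ones.
print(len(dense), len(sparse))
```

In this toy setting the sparse path produces the same scores as the dense path at every occupied cell while evaluating the head 100 times instead of 4096. The gap widens as the detection range grows, since LiDAR occupancy becomes sparser at long range.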

Online: August 12, 2025