Abstract: Deep neural network compression methods that rely on a single, fixed pattern struggle to compress a model sufficiently because of the constraint on accuracy loss. As a result, the compressed model still consumes costly and limited storage resources when deployed, which is a significant barrier to its use on edge devices. To address this problem, this article proposes an adaptive joint compression method that optimizes the model structure and the weight bit-width in parallel. In contrast to most existing combined compression methods, sparsity and quantization are fused thoroughly during joint compression training to reduce model parameter redundancy comprehensively. Meanwhile, a layer-wise adaptive sparsity ratio and weight bit-width are designed to address suboptimal model accuracy and to reduce the accuracy loss caused by a fixed compression ratio. Experimental results for VGG, ResNet, and MobileNet on the CIFAR-10 dataset show that the proposed method achieves parameter compression ratios of 143.0×, 151.6×, and 19.7×, with corresponding accuracy losses of 1.3%, 2.4%, and 0.9%, respectively. In addition, compared with 12 typical compression methods, the proposed method reduces hardware memory consumption by 15.3× to 148.5×. Furthermore, on a self-built remote sensing optical image dataset, the proposed method achieves a maximum compression ratio of 284.2× while keeping the accuracy loss within 1.2%.
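To illustrate how layer-wise sparsity and weight bit-width combine into one parameter compression ratio, the following is a minimal sketch. It is not the method proposed in the article. It assumes a 32-bit floating-point baseline, ignores the storage overhead of sparse indices, and uses a hypothetical three-layer configuration whose numbers are invented for illustration only.

```python
def compression_ratio(layers, baseline_bits=32):
    """Estimate a parameter compression ratio.

    layers: list of (num_params, num_nonzero, bit_width) tuples, one per layer.
    Assumptions (not from the article): the dense baseline stores every weight
    with `baseline_bits` bits, and sparse-index overhead is ignored.
    """
    dense_bits = sum(n * baseline_bits for n, _, _ in layers)
    compressed_bits = sum(nz * bits for _, nz, bits in layers)
    return dense_bits / compressed_bits


# Hypothetical layer-wise settings: larger layers are pruned harder
# and quantized to fewer bits.
example_layers = [
    (1_000, 500, 8),      # 50% of weights kept, 8-bit
    (10_000, 1_000, 4),   # 10% of weights kept, 4-bit
    (100_000, 5_000, 2),  # 5% of weights kept, 2-bit
]

ratio = compression_ratio(example_layers)
print(f"estimated compression ratio: {ratio:.1f}x")
```

Because the per-layer terms are multiplicative (fraction of weights kept times bits per weight), letting each layer take its own sparsity ratio and bit-width can yield a much higher overall ratio than a single fixed setting applied to every layer.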