Abstract: This article proposes a variable step size LMS algorithm based on deep reinforcement learning to address two problems: the difficulty of balancing convergence speed against steady-state error in the fixed step size LMS algorithm, and the strong dependence on initial parameter selection, heavy tuning workload, and subjectivity of traditional variable step size algorithms. The proposed algorithm depends only weakly on its initial parameters and avoids the cumbersome parameter tuning process. First, a model integrating deep reinforcement learning with adaptive filtering is constructed, in which a deep reinforcement learning agent controls the change of the step size factor, replacing the nonlinear function used for step size adjustment in traditional variable step size algorithms and thereby reducing the complexity of using the algorithm. Second, an error-based state reward and a step-size-based action reward are proposed, and dynamic reward and negative reward mechanisms are introduced, which effectively improve the convergence speed of the algorithm. In addition, a network architecture based on an incomplete encoder is designed to improve the inference ability of the reinforcement learning policy. Experimental results show that, compared with other recent variable step size algorithms, the proposed algorithm quickly adjusts to a reasonable step size under different initial parameters, reduces the experimental parameter tuning workload, and achieves faster convergence and smaller steady-state error. The trained network has also been applied to practical tasks such as system identification, signal denoising, and filtering of water level signals at the closure gap, where it achieves good performance, further confirming the generalization ability and effectiveness of the algorithm.
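The following is a minimal sketch, not the authors' implementation, of the general scheme the abstract describes: an LMS filter whose step size is chosen at each iteration by an external policy (here a hypothetical hand-written rule standing in for the trained deep reinforcement learning agent). The function names `lms_variable_step` and `toy_policy`, the initial step size, and the system-identification test setup are illustrative assumptions, not details taken from the paper.

```python
# Sketch of a variable step size LMS loop where the step size is supplied by
# an external policy each iteration (a DRL agent in the paper; a toy rule here).
import numpy as np

def lms_variable_step(x, d, num_taps, step_policy):
    """Run LMS; step_policy(error, mu) returns the step size for the next update."""
    w = np.zeros(num_taps)              # adaptive filter weights
    mu = 0.01                           # initial step size (assumed value)
    errors = np.zeros(len(x))
    for n in range(num_taps, len(x)):
        u = x[n - num_taps + 1:n + 1][::-1]  # most recent input vector, newest first
        y = w @ u                            # filter output
        e = d[n] - y                         # instantaneous error
        mu = step_policy(e, mu)              # policy/agent chooses the new step size
        w = w + mu * e * u                   # standard LMS weight update
        errors[n] = e
    return w, errors

def toy_policy(error, mu):
    # Hypothetical stand-in for the trained agent: keep the step size large while
    # the error is large (fast convergence), shrink it as the error falls
    # (small steady-state misadjustment).
    return float(np.clip(0.5 * abs(error), 1e-4, 0.05))

if __name__ == "__main__":
    # Simple system identification demo with an assumed unknown FIR system.
    rng = np.random.default_rng(0)
    h_true = np.array([0.8, -0.4, 0.2, 0.1])
    x = rng.standard_normal(5000)
    d = np.convolve(x, h_true, mode="full")[:len(x)] + 0.01 * rng.standard_normal(len(x))
    w, e = lms_variable_step(x, d, num_taps=4, step_policy=toy_policy)
    print("estimated weights:", np.round(w, 3))
```

In the paper's scheme, `toy_policy` would be replaced by the deep reinforcement learning agent, which observes error-based state information and is trained with the reward design summarized in the abstract.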