Abstract: Dioxin (DXN), known as “the poison of the century”, is one of the by-products emitted by the municipal solid waste incineration (MSWI) process. Because measuring DXN emission concentration is technically difficult, slow, and costly, it is hard to obtain enough labeled samples to build a DXN emission soft-sensor model. To make effective use of the large number of unlabeled samples collected by the field control system, and to address the poor generalization performance of traditional shallow learning models, a soft-sensor method for DXN emission concentration based on Bagging semi-supervised deep forest regression (DFR) is proposed. First, multiple training subsets are obtained by resampling the original labeled dataset with the Bagging mechanism, and a diverse set of random forest (RF) models is built from them. Then, high-confidence pseudo-labeled samples are obtained through three strategies: iteratively updating the RF models, selecting nearest-neighbor sets, and evaluating generalization performance. Finally, a DFR model is constructed from the pseudo-labeled and original labeled sample sets. The effectiveness of the proposed method is evaluated on actual DXN detection data from an MSWI power plant in Beijing. The results show that the proposed method has good prediction stability, with root mean square errors of 0.01550, 0.02023, and 0.01973 on the training, validation, and testing datasets, respectively.
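The three-step procedure summarized in the abstract can be illustrated with a minimal sketch, assuming scikit-learn and NumPy are available. It is not the authors' implementation: the function names, parameters, and defaults are hypothetical, the confidence rule (accept a pseudo-labeled sample only if it lowers the RMSE on its nearest labeled neighbors) is one plausible reading of the nearest-neighbor and generalization-evaluation strategies, and a single random forest stands in for the DFR model, which scikit-learn does not provide.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import NearestNeighbors


def _fit_rf(X, y, seed):
    # Small forest keeps the sketch fast; a fixed seed makes before/after comparisons fair.
    return RandomForestRegressor(n_estimators=10, random_state=seed).fit(X, y)


def _rmse(model, X, y):
    return float(np.sqrt(np.mean((model.predict(X) - y) ** 2)))


def select_pseudo_labels(X_lab, y_lab, X_unl, n_models=3, k=3, n_iter=2,
                         pool_size=5, seed=0):
    """Return (indices into X_unl, pseudo-labels) judged high-confidence.

    Hypothetical helper; requires n_models >= 2 and k <= len(X_lab).
    """
    rng = np.random.default_rng(seed)
    n = len(X_lab)

    # Step 1 (Bagging): bootstrap-resample the labeled set, one RF per subset.
    subsets = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)
        subsets.append([X_lab[idx], y_lab[idx]])
    models = [_fit_rf(X, y, seed + j) for j, (X, y) in enumerate(subsets)]

    knn = NearestNeighbors(n_neighbors=k).fit(X_lab)
    remaining = list(range(len(X_unl)))
    picked_idx, picked_y = [], []

    # Step 2: iteratively update each RF with its most helpful pseudo-labeled sample.
    for _ in range(n_iter):
        for j in range(n_models):
            if not remaining:
                break
            pool = rng.choice(remaining, size=min(pool_size, len(remaining)),
                              replace=False)
            others = [m for i, m in enumerate(models) if i != j]
            best_gain, best_u, best_y, best_model = 0.0, None, None, None
            for u in pool:
                x_u = X_unl[u:u + 1]
                # Pseudo-label: mean prediction of the other ensemble members.
                y_u = float(np.mean([m.predict(x_u)[0] for m in others]))
                # Nearest-neighbor set: the k labeled samples closest to x_u.
                nbrs = knn.kneighbors(x_u, return_distance=False)[0]
                Xn, yn = X_lab[nbrs], y_lab[nbrs]
                # Generalization check: does adding (x_u, y_u) lower RMSE on the neighbors?
                Xj, yj = subsets[j]
                trial = _fit_rf(np.vstack([Xj, x_u]), np.append(yj, y_u), seed + j)
                gain = _rmse(models[j], Xn, yn) - _rmse(trial, Xn, yn)
                if gain > best_gain:
                    best_gain, best_u, best_y, best_model = gain, int(u), y_u, trial
            if best_u is not None:
                subsets[j] = [np.vstack([subsets[j][0], X_unl[best_u:best_u + 1]]),
                              np.append(subsets[j][1], best_y)]
                models[j] = best_model
                remaining.remove(best_u)
                picked_idx.append(best_u)
                picked_y.append(best_y)
    return np.array(picked_idx, dtype=int), np.array(picked_y, dtype=float)


def fit_final_model(X_lab, y_lab, X_unl, idx, y_pseudo, seed=0):
    # Step 3 stand-in: a single RF (NOT a deep forest) trained on the
    # labeled samples plus the selected pseudo-labeled samples.
    X = np.vstack([X_lab, X_unl[idx]])
    y = np.concatenate([y_lab, y_pseudo])
    return RandomForestRegressor(n_estimators=50, random_state=seed).fit(X, y)
```

In this sketch, `select_pseudo_labels` would be called with the labeled and unlabeled feature arrays, and its output passed to `fit_final_model`. At most `n_iter * n_models` pseudo-labeled samples are selected, and none are added when no candidate reduces the neighborhood error.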