Nested entity recognition method for power transformer defect records

Affiliation:

College of Electrical Engineering and Automation, Fuzhou University, Fuzhou 350108, China

CLC number:

TM721; TH86

Fund Project:

Supported by the Fujian Provincial University-Industry Cooperation Project (2023H6006)

    Abstract:

    As power grids have expanded in recent years, a vast volume of unstructured equipment defect records has been collected and accumulated. These records contain entity information that is crucial for equipment condition assessment and for operation and maintenance decision-making. However, nested entity structures are common in such data. They increase entity boundary ambiguity and contextual semantic complexity, and pose a significant challenge to traditional named entity recognition methods. To identify nested entities in defect records accurately, this paper takes the power transformer as a typical piece of equipment and proposes a named entity recognition method for power transformer defect records that combines pre-training BERT with permuted language model (PERT), a bidirectional gated recurrent unit (BiGRU), and an efficient global pointer (EGP). The method first uses the PERT model as the vector embedding layer for dynamic semantic encoding, relying on its permuted pre-training objective to capture contextual dependencies in the text. A BiGRU network then serves as the text encoding layer, extracting semantic features more completely through its bidirectional gating mechanism. Finally, EGP is adopted as the decoding output layer. It focuses on entity spans and boundaries, which enables precise extraction of nested entities and avoids the label conflicts that arise with traditional conditional random field (CRF) decoding. Experimental results show that the method handles nested entities effectively: its overall F1 score reaches 96.01%, which is 0.55% and 0.70% higher than that of two conventional nested recognition methods, bidirectional encoder representations from transformers with biaffine attention (BERT-Biaffine) and bidirectional encoder representations from transformers with machine reading comprehension (BERT-MRC), respectively. The method also attains the highest F1 score on all five entity label categories. In particular, for defect equipment and defect component recognition, where nesting is most prominent, the F1 scores of the proposed method reach 100% and 94.74%, improvements of 0.57% and 0.13% over the best baseline model.
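    The span-based decoding idea described in the abstract can be illustrated with a minimal sketch. It assumes that the decoder produces, for each entity type, a score for every candidate span (start, end), and that spans scoring above a threshold of 0 are kept, a convention commonly used with global-pointer-style decoders. The function `decode_spans`, the helper `empty_matrix`, the example sentence, the label names, and the scores are all hypothetical and are not taken from the paper. Because each span is scored independently, a defect component span can contain a defect equipment span without any conflict, which a single tag per token cannot express.

```python
from typing import Dict, List, Tuple

Span = Tuple[str, int, int]  # (label, start, end), end index inclusive


def decode_spans(scores: Dict[str, List[List[float]]],
                 threshold: float = 0.0) -> List[Span]:
    """Return every (label, start, end) whose score exceeds the threshold.

    scores[label][i][j] is the score that tokens i..j (inclusive) form an
    entity of type `label`. Only the upper triangle (i <= j) is read, since
    a span cannot end before it starts.
    """
    spans: List[Span] = []
    for label, matrix in scores.items():
        n = len(matrix)
        for i in range(n):
            for j in range(i, n):
                if matrix[i][j] > threshold:
                    spans.append((label, i, j))
    # Sort by position so nested spans appear next to each other.
    return sorted(spans, key=lambda s: (s[1], s[2], s[0]))


def empty_matrix(n: int, fill: float = -10.0) -> List[List[float]]:
    """Hypothetical helper: an n x n matrix of strongly negative scores."""
    return [[fill] * n for _ in range(n)]


# Hypothetical sentence, labels, and scores (not from the paper's dataset).
tokens = ["main", "transformer", "bushing", "oil", "leak"]
scores = {
    "equipment": empty_matrix(5),
    "part": empty_matrix(5),
    "defect": empty_matrix(5),
}
scores["equipment"][0][1] = 4.2  # "main transformer"
scores["part"][0][2] = 3.1       # "main transformer bushing", contains the span above
scores["defect"][3][4] = 5.0     # "oil leak"

entities = decode_spans(scores)
for label, start, end in entities:
    print(label, " ".join(tokens[start:end + 1]))
```

    In this sketch the equipment span (0, 1) lies inside the part span (0, 2), and both are returned. In the method the abstract describes, such scores would come from the EGP layer on top of the PERT and BiGRU encoders rather than being set by hand.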

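Cite this article

缪希仁, 曹喜龙, 江灏, 陈静, 林蔚青. Nested entity recognition method for power transformer defect records[J]. 仪器仪表学报 (Chinese Journal of Scientific Instrument), 2026, 47(2): 161-172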

History
  • Published online: 2026-04-08