Partial discharge pattern recognition (PDPR) is a cornerstone of fault diagnosis and has become a focal point of research in power systems. However, PDPR faces several challenges, such as low signal quality and complex discharge patterns. To address these challenges, we propose a multiscale residual aggregation transformer network (MRATNet). MRATNet learns long-range semantic dependencies and discriminative features in partial discharge (PD) signals. Moreover, it integrates convolutional and transformer architectures as the feature extraction backbon...