Convolutional neural networks (CNNs) have become one of the most popular tools for hyperspectral image (HSI) classification. However, CNNs struggle to model long-range dependencies, which may degrade classification performance. To address this issue, this letter proposes a transformer-based backbone network for HSI classification. Its core component is a newly designed double-attention transformer encoder (DATE), which contains two self-attention modules, termed the spectral attention module (SPE) and the spatial attention module ...
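As a rough illustration of the double-attention idea only, the following sketch shows one encoder block that applies multi-head self-attention separately over spectral tokens and spatial tokens. This is not the authors' implementation: the class names (`DATEBlock`), the token layouts, and all hyper-parameters are assumptions introduced here for illustration.

```python
# Minimal sketch of a double-attention encoder block, assuming the HSI patch
# has been embedded into two token sequences: spectral tokens (one per band)
# and spatial tokens (one per pixel). All names and sizes are illustrative.
import torch
import torch.nn as nn


class DATEBlock(nn.Module):
    """Hypothetical double-attention transformer encoder block (not the paper's exact design)."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        # Spectral self-attention: tokens indexed by spectral band (SPE-like, assumed).
        self.spe_norm = nn.LayerNorm(dim)
        self.spe_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Spatial self-attention: tokens indexed by spatial position (assumed).
        self.spa_norm = nn.LayerNorm(dim)
        self.spa_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Standard transformer feed-forward sub-layer shared by both streams.
        self.ffn = nn.Sequential(
            nn.LayerNorm(dim),
            nn.Linear(dim, 4 * dim),
            nn.GELU(),
            nn.Linear(4 * dim, dim),
        )

    def forward(self, spectral_tokens: torch.Tensor, spatial_tokens: torch.Tensor):
        # spectral_tokens: (B, n_bands, dim); spatial_tokens: (B, n_pixels, dim)
        x = self.spe_norm(spectral_tokens)
        spectral_tokens = spectral_tokens + self.spe_attn(x, x, x, need_weights=False)[0]
        y = self.spa_norm(spatial_tokens)
        spatial_tokens = spatial_tokens + self.spa_attn(y, y, y, need_weights=False)[0]
        # Residual feed-forward applied to each token stream.
        spectral_tokens = spectral_tokens + self.ffn(spectral_tokens)
        spatial_tokens = spatial_tokens + self.ffn(spatial_tokens)
        return spectral_tokens, spatial_tokens


# Usage sketch: a 9x9 patch with 200 bands, embedded to dim=64.
if __name__ == "__main__":
    block = DATEBlock(dim=64)
    spe = torch.randn(2, 200, 64)   # spectral tokens
    spa = torch.randn(2, 81, 64)    # spatial tokens
    out_spe, out_spa = block(spe, spa)
    print(out_spe.shape, out_spa.shape)
```

The point of the two parallel attention paths is that each token stream can attend globally along its own dimension, which is the kind of long-range interaction a fixed-size convolution kernel cannot capture directly.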