Traffic flow forecasting plays an important role in the intelligent traffic system, which is the basis for traffic control and traffic management. However, due to the complex spatial-temporal dependence, traffic flow forecasting has always been a difficulty in the field of intelligent traffic. In order to select a suitable spatialtemporal forecasting method and solve the problem that recurrent neural architecture is not conducive to parallel computing, we construct a spatial-temporal forecasting model by using multi-head attention models. Use g...