To reduce the complexity of Chinese text similarity calculation and improve text clustering accuracy, this paper proposes a new text similarity algorithm based on DF_LDA. First, we use the document frequency (DF) method for feature extraction; then, we use latent Dirichlet allocation (LDA) to construct a text topic model; finally, we use the resulting DF_LDA model to calculate text similarity. Because it considers both the semantic and the word frequency information of a text, the new method can improve text clustering precision. In addition, the DF_LDA method reduces the dimensionality of the text feature vector twice; it can efficiently save text similarity c...
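
The three steps described above (DF feature selection, LDA topic modelling, similarity on the reduced representation) can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions: documents are already segmented into tokens (Chinese text would first need a word segmenter), the DF threshold `min_df`, the topic count `k`, and the priors `alpha` and `beta` are hypothetical values, LDA is fitted with a small collapsed Gibbs sampler, and cosine similarity stands in for whatever similarity measure the paper defines. The function names `df_select`, `lda_gibbs`, and `cosine` are illustrative only.

```python
import math
import random
from collections import Counter


def df_select(docs, min_df=2):
    """First reduction: keep terms that occur in at least min_df documents."""
    df = Counter(term for doc in docs for term in set(doc))
    return {term for term, count in df.items() if count >= min_df}


def lda_gibbs(docs, vocab, k=2, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Second reduction: map each document to a k-dimensional topic
    distribution using collapsed Gibbs sampling (assumed inference method)."""
    rng = random.Random(seed)
    word_id = {w: i for i, w in enumerate(sorted(vocab))}
    corpus = [[word_id[w] for w in doc if w in word_id] for doc in docs]
    v = len(word_id)
    n_dk = [[0] * k for _ in corpus]      # topic counts per document
    n_kw = [[0] * v for _ in range(k)]    # word counts per topic
    n_k = [0] * k                         # total words per topic
    z = []                                # topic assignment of every token
    for d, doc in enumerate(corpus):
        assignments = []
        for w in doc:
            t = rng.randrange(k)
            assignments.append(t)
            n_dk[d][t] += 1
            n_kw[t][w] += 1
            n_k[t] += 1
        z.append(assignments)
    for _ in range(iters):
        for d, doc in enumerate(corpus):
            for i, w in enumerate(doc):
                t = z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][t] -= 1
                n_kw[t][w] -= 1
                n_k[t] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (n_dk[d][j] + alpha) * (n_kw[j][w] + beta) / (n_k[j] + v * beta)
                    for j in range(k)
                ]
                t = rng.choices(range(k), weights=weights)[0]
                z[d][i] = t
                n_dk[d][t] += 1
                n_kw[t][w] += 1
                n_k[t] += 1
    # Smoothed document-topic distributions; each row sums to 1.
    return [
        [(n_dk[d][j] + alpha) / (len(doc) + k * alpha) for j in range(k)]
        for d, doc in enumerate(corpus)
    ]


def cosine(a, b):
    """Cosine similarity between two topic vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy corpus of pre-tokenized documents (hypothetical data).
docs = [
    ["stock", "market", "price", "stock", "trade"],
    ["market", "trade", "price", "bank"],
    ["football", "match", "goal", "team"],
    ["team", "goal", "match", "league", "football"],
]
vocab = df_select(docs, min_df=2)   # vocabulary after DF filtering
theta = lda_gibbs(docs, vocab, k=2) # one topic vector per document
similarity = cosine(theta[0], theta[1])
print(f"similarity(doc0, doc1) = {similarity:.3f}")
```

In this sketch the vocabulary shrinks once when DF filtering drops rare terms, and the representation shrinks again when each document is mapped to `k` topic proportions, so the similarity is computed on vectors of length `k` rather than on the full term space.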