Author affiliations:
[Jiang, Xingpeng; Yang, Jincai; He, Tingting; Shen, Xianjun; Hu, Xiaohua; Chen, Yao] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.
Corresponding institution:
[Shen, Xianjun] C;Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.
Keywords:
*Disease network;*Heterogeneous network;*Microbe network;*Random walk
Abstract:
The microbiota shows remarkable variability within individuals. At the same time, the microorganisms living in the human body play a very important role in health and disease, so identifying the relationships between microbes and diseases will contribute to a better understanding of microbial interactions and functional mechanisms. However, sequencing technologies produce very large amounts of microbial data, while the known associations between diseases and microbes are very few. In bioinformatics, many researchers use network topology analysis to address such problems. Inspired by this idea, we propose a new method that prioritizes candidate microbes to predict potential disease-microbe associations. First, we connect the disease network and the microbe network using the known disease-microbe relationships to construct a heterogeneous network. We then extend the random walk to the heterogeneous network and evaluate the method with leave-one-out cross-validation and ROC curves. The results indicate that the algorithm can uncover potential associations between diseases and microbes that cannot be found using the microbe network or the disease network alone. Furthermore, we study three representative diseases, type 2 diabetes, asthma and psoriasis, and present the potential microbes associated with each disease by ranking candidate disease-causing microbes. We believe that the discovery of these new associations will be clinically useful for understanding disease mechanisms, diagnosis and therapy.
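The random walk on a heterogeneous network described in this abstract can be illustrated with a minimal sketch. It is a simplified random walk with restart, not the authors' exact formulation: the names `A_d`, `A_m`, `B` and the `restart` value are assumptions, and the paper's transition matrix (for example, a separate jump probability between the two networks) may differ.

```python
import numpy as np

def rwr_heterogeneous(A_d, A_m, B, seed_disease, restart=0.7, tol=1e-10, max_iter=1000):
    """Rank microbes for one disease by random walk with restart.

    A_d: disease-disease adjacency (n_d x n_d), assumed symmetric
    A_m: microbe-microbe adjacency (n_m x n_m), assumed symmetric
    B:   known disease-microbe associations (n_d x n_m)
    """
    n_d, n_m = B.shape
    # Join the two networks through the known associations.
    A = np.block([[A_d, B], [B.T, A_m]]).astype(float)
    # Column-normalise so each column is a probability distribution.
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    W = A / col_sums
    # The walker restarts at the query disease.
    p0 = np.zeros(n_d + n_m)
    p0[seed_disease] = 1.0
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * p0
        converged = np.abs(p_next - p).sum() < tol
        p = p_next
        if converged:
            break
    return p[n_d:]  # steady-state scores of the microbe nodes

# Toy data (hypothetical): 2 diseases, 3 microbes.
A_d = np.array([[0, 1], [1, 0]])
A_m = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
B = np.array([[1, 0, 0], [0, 0, 1]])
scores = rwr_heterogeneous(A_d, A_m, B, seed_disease=0)
ranking = np.argsort(-scores)  # candidate microbes, best first
```

In leave-one-out cross-validation, one known association in `B` would be set to zero before the walk and the rank of the held-out microbe recorded.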
Author affiliations:
[Ma, Yuanyuan] Cent China Normal Univ, Sch Informat Management, Wuhan, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Hu, Xiaohua] Cent China Normal Univ, Sch Comp, Wuhan, Peoples R China.;[Ma, Yuanyuan] Anyang Normal Univ, Anyang, Peoples R China.
Corresponding institution:
[Jiang, Xingpeng] C;Cent China Normal Univ, Sch Comp, Wuhan, Peoples R China.
Conference name:
IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM)
Conference date:
DEC 15-18, 2016
Conference location:
Shenzhen, PEOPLES R CHINA
Conference organizer:
[Ma, Yuanyuan] Cent China Normal Univ, Sch Informat Management, Wuhan, Peoples R China.^[Hu, Xiaohua;He, Tingting;Jiang, Xingpeng] Cent China Normal Univ, Sch Comp, Wuhan, Peoples R China.^[Ma, Yuanyuan] Anyang Normal Univ, Anyang, Peoples R China.
Proceedings title:
IEEE International Conference on Bioinformatics and Biomedicine-BIBM
Keywords:
Human Microbiome;Laplacian Regularization;Multi-view Clustering;Symmetric Nonnegative Matrix Factorization
Authors:
Luo Changri;Zhang Xinhua*;He Tingting*(何婷婷);Huang Baohua;Wu, Shaojing;...
Journal:
PROCEEDINGS OF 2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2017: 2286-2290
Corresponding author:
Zhang Xinhua;He Tingting
Author affiliations:
[Luo Changri; Xie, Yaohui] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Sch Vocat & Continuing Educ, Wuhan, Hubei, Peoples R China.;[Zhang Xinhua] Wuhan Vocat Coll Software & Engn, Sch Comp Sci, Wuhan, Hubei, Peoples R China.;[He Tingting] Cent China Normal Univ, Acad Comp Sci, Wuhan, Hubei, Peoples R China.;[Huang Baohua] Cent China Normal Univ, Sch Vocat & Continuing Educ, Wuhan, Hubei, Peoples R China.;[Wu, Shaojing] Cent China Normal Univ, Sch Informat Management, Natl Engn Res Ctr E Learning, Wuhan, Hubei, Peoples R China.
Corresponding institution:
[Zhang Xinhua] W;[He Tingting] C;Wuhan Vocat Coll Software & Engn, Sch Comp Sci, Wuhan, Hubei, Peoples R China.;Cent China Normal Univ, Acad Comp Sci, Wuhan, Hubei, Peoples R China.
Conference name:
3rd IEEE International Conference on Computer and Communications (ICCC)
Conference date:
DEC 13-16, 2017
Conference location:
Chengdu, PEOPLES R CHINA
Conference organizer:
[Luo Changri;Xie, Yaohui] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Sch Vocat & Continuing Educ, Wuhan, Hubei, Peoples R China.^[Zhang Xinhua] Wuhan Vocat Coll Software & Engn, Sch Comp Sci, Wuhan, Hubei, Peoples R China.^[He Tingting] Cent China Normal Univ, Acad Comp Sci, Wuhan, Hubei, Peoples R China.^[Huang Baohua] Cent China Normal Univ, Sch Vocat & Continuing Educ, Wuhan, Hubei, Peoples R China.^[Wu, Shaojing] Cent China Normal Univ, Sch Informat Management, Natl Engn Res Ctr E Learning, Wuhan, Hubei, Peoples R China.
Keywords:
two-stage clustering;online learning;learning collaboration group
Abstract:
Many studies have confirmed the importance of online learning collaboration groups. In large-scale online learning, effectively finding such groups is a difficult problem. The online learning forum is the main place where learners study and communicate, so it is also the main venue where learning collaboration groups carry out collaborative learning. During collaboration, the members of a learning team show two characteristics: they interact with one another, and the contents of their discussions are highly relevant to each other. This paper uses these two characteristics in a two-stage clustering algorithm, based on the interaction structure and the interaction contents of learners, to find potential learning collaboration groups in a large-scale online learning forum. The experimental results show that the proposed method is effective, which has practical significance for large-scale online learning support services.
Authors:
Zhou, Guangyou*;Xie, Zhiwen;He, Tingting(何婷婷);Zhao, Jun;Hu, Xiaohua Tony
Journal:
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24(7): 1305-1314 ISSN:2329-9290
Corresponding author:
Zhou, Guangyou
Author affiliations:
[He, Tingting; Zhou, Guangyou; Zhao, Jun; Xie, Zhiwen; Hu, Xiaohua Tony] Cent China Normal Univ, Sch Comp Sci, Wuhan 430079, Peoples R China.;[Zhao, Jun] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100080, Peoples R China.;[Hu, Xiaohua Tony] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Corresponding institution:
[Zhou, Guangyou] C;Cent China Normal Univ, Sch Comp Sci, Wuhan 430079, Peoples R China.
Keywords:
Community Question Answering;Information Retrieval;Natural Language Processing;Question Retrieval;Text Mining
Abstract:
Community question answering (CQA) has become an increasingly popular research topic. In this paper, we focus on the problem of question retrieval. Question retrieval in CQA can automatically find the most relevant and recent questions that have been solved by other users. However, the word ambiguity and word mismatch problems bring about new challenges for question retrieval in CQA. State-of-the-art approaches address these issues by implicitly expanding the queried questions with additional words or phrases using monolingual translation models. While useful, the effectiveness of these models is highly dependent on the availability of quality parallel monolingual corpora (e.g., question-answer pairs) in the absence of which they are troubled by noise issues. In this work, we propose an alternative way to address the word ambiguity and word mismatch problems by taking advantage of potentially rich semantic information drawn from other languages. Our proposed method employs statistical machine translation to improve question retrieval and enriches the question representation with the translated words from other languages via non-negative matrix factorization. Experiments conducted on real CQA data sets show that our proposed approach is promising.
Author affiliations:
[Jiang, Xingpeng; He, Tingting; Hu, Xiaohua] Cent China Normal Univ, Sch Comp Sci, Wuhan 430079, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Corresponding institution:
[He, Tingting] C;Cent China Normal Univ, Sch Comp Sci, Wuhan 430079, Peoples R China.
Keywords:
microbiome;information distance;data visualization;density clustering;microbial community
Abstract:
Clustering technology is a method for grouping data points into clusters containing a group of similar data points. In a real dataset such as microbiome data, the data points are presented as profiles or a probability distribution. These data points form the periphery of a cluster, making it difficult to identify the real clustering structure. In this study, we used density clustering on several distance measures to overcome this difficulty. Experiments using a real dataset indicated that the Manhattan distance is an appropriate distance measure for clustering analysis of microbiome data.
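As an illustration of density clustering on compositional profiles with the Manhattan distance, the sketch below uses a small DBSCAN-style procedure. The abstract does not say which density clustering algorithm was used, so DBSCAN is a stand-in here, and the profiles, `eps` and `min_pts` are made-up values.

```python
import numpy as np

def manhattan_matrix(X):
    """Pairwise Manhattan (L1) distances between the rows of X."""
    return np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)

def dbscan(D, eps, min_pts):
    """DBSCAN on a precomputed distance matrix; label -1 marks noise."""
    n = D.shape[0]
    labels = np.full(n, -1)
    neighbors = [np.flatnonzero(D[i] <= eps) for i in range(n)]
    core = [len(nb) >= min_pts for nb in neighbors]  # a point counts itself
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not core[i]:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:
            j = stack.pop()
            if not core[j]:
                continue  # border points join a cluster but do not expand it
            for k in neighbors[j]:
                if labels[k] == -1:
                    labels[k] = cluster
                    stack.append(k)
        cluster += 1
    return labels

# Hypothetical relative-abundance profiles (each row sums to 1).
profiles = np.array([
    [0.70, 0.20, 0.10],
    [0.68, 0.22, 0.10],
    [0.72, 0.18, 0.10],
    [0.10, 0.20, 0.70],
    [0.12, 0.18, 0.70],
    [0.08, 0.22, 0.70],
    [0.34, 0.33, 0.33],
])
labels = dbscan(manhattan_matrix(profiles), eps=0.1, min_pts=3)
```

Because the algorithm only needs a distance matrix, other measures can be compared by swapping `manhattan_matrix` for a different function.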
Abstract:
Most of the existing information retrieval models assume that the terms of a text document are independent of each other. These retrieval models integrate three major variables to determine the degree of importance of a term for a document: within document term frequency, document length and the specificity of the term in the collection. Intuitively, the importance of a term for a document is not only dependent on the three aspects mentioned above, but also dependent on the degree of semantic coherence between the term and the document. In this paper, we propose a heuristic approach, in which the degree of semantic coherence of the query terms with a document is adopted to improve the information retrieval performance. Experimental results on standard TREC collections show the proposed models consistently outperform the state-of-the-art models.
Abstract:
Nonnegative matrix factorization (NMF) has received considerable attention due to its interpretation of observed samples as combinations of different components, and has been successfully used as a clustering method. As an extension of NMF, Symmetric NMF (SNMF) inherits the advantages of NMF. Unlike NMF, however, SNMF takes a nonnegative similarity matrix as an input, and two lower rank nonnegative matrices (H, H^T) are computed as an output to approximate the original similarity matrix. Laplacian regularization has improved the clustering performance of NMF and SNMF. However, Laplacian regularization (LR), as a classic manifold regularization method, suffers some problems because of its weak extrapolating ability. In this paper, we propose a novel variant of SNMF, called Hessian regularization based symmetric nonnegative matrix factorization (HSNMF), for this purpose. In contrast to Laplacian regularization, Hessian regularization fits the data perfectly and extrapolates nicely to unseen data. We conduct extensive experiments on several datasets including text data, gene expression data and HMP (Human Microbiome Project) data. The results show that the proposed method outperforms other methods, which suggests the potential application of HSNMF in biological data clustering. (C) 2016 Published by Elsevier Inc.
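To make the SNMF setting concrete, the following sketch factorizes a nonnegative similarity matrix A as H H^T using the commonly used multiplicative update with damping factor 0.5. It covers plain SNMF only; the Hessian regularization term of HSNMF is not included, and the toy matrix and parameter values are assumptions.

```python
import numpy as np

def snmf(A, k, n_iter=300, beta=0.5, seed=0):
    """Approximate a symmetric nonnegative matrix A by H @ H.T with H >= 0."""
    rng = np.random.default_rng(seed)
    H = rng.random((A.shape[0], k))
    eps = 1e-12  # avoids division by zero
    for _ in range(n_iter):
        numer = A @ H
        denom = H @ (H.T @ H) + eps
        # Damped multiplicative update; entries of H stay nonnegative.
        H = H * ((1 - beta) + beta * numer / denom)
    return H

def recon_error(A, H):
    """Frobenius norm of the approximation error."""
    return np.linalg.norm(A - H @ H.T)

# Hypothetical similarity matrix with two blocks of similar samples.
A = np.full((6, 6), 0.05)
A[:3, :3] = 1.0
A[3:, 3:] = 1.0

H_init = snmf(A, k=2, n_iter=0)  # random initialisation only
H = snmf(A, k=2)
labels = H.argmax(axis=1)        # cluster = largest entry in each row of H
```

Reading cluster labels from the largest entry in each row of H is the usual way SNMF is used for clustering.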
Journal:
Lecture Notes in Computer Science, 2016, 10102: 300-311 ISSN:0302-9743
Corresponding author:
Zhou, Guangyou
Author affiliations:
[He, Tingting; Zeng, Zhao; Zhou, Guangyou; Xie, Zhiwen] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.
Corresponding institution:
[Zhou, Guangyou] C;Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.
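Conference name:
5th Conference on Natural Language Processing and Chinese Computing (NLPCC-ICCPOL 2016)
Conference date:
2016-12-02
Conference location:
Kunming
Conference organizer:
Kunming Univ Sci & Technol
Proceedings title:
Proceedings of the 5th Conference on Natural Language Processing and Chinese Computing (NLPCC-ICCPOL 2016)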
Abstract:
This paper focuses on the task of knowledge-based question answering (KBQA). KBQA aims to match questions with the structured semantics in a knowledge base. In this paper, we propose a two-stage method. First, we propose a topic entity extraction model (TEEM) to extract topic entities in questions, which does not rely on hand-crafted features or linguistic tools. We extract topic entities in questions with the TEEM and then retrieve the knowledge triples related to those entities from the knowledge base as the candidate knowledge triples. Then, we apply Deep Structured Semantic Models based on a convolutional neural network and bidirectional long short-term memory to match questions against the predicates in the candidate knowledge triples. To obtain a better training dataset, we use an iterative approach to retrieve the knowledge triples from the knowledge base. The evaluation results show that our system achieves an average F1 measure of 79.57% on the test dataset.
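The average F1 measure reported here is usually computed per question and then averaged over all questions. The sketch below assumes that definition (it is the common convention in KBQA evaluation, but the abstract does not spell it out), and the answer sets are invented.

```python
def f1(predicted, gold):
    """F1 between a predicted answer set and a gold answer set."""
    predicted, gold = set(predicted), set(gold)
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

def average_f1(all_predicted, all_gold):
    """Mean of the per-question F1 scores."""
    scores = [f1(p, g) for p, g in zip(all_predicted, all_gold)]
    return sum(scores) / len(scores)

# Hypothetical answers for three questions.
predicted = [["a"], ["b", "c"], []]
gold = [["a"], ["b"], ["d"]]
avg = average_f1(predicted, gold)
```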
Authors:
Zhou, Guangyou*;Zhu, Zhiyuan;He, Tingting(何婷婷);Hu, Xiaohua Tony
Journal:
Knowledge and Information Systems, 2016, 47(1): 27-44 ISSN:0219-1377
Corresponding author:
Zhou, Guangyou
Author affiliations:
[Zhou, Guangyou; Hu, Xiaohua Tony] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.;[He, Tingting] Cent China Normal Univ, Sch Comp, Nat Language Proc Lab, Wuhan 430079, Peoples R China.;[Zhu, Zhiyuan] Chinese Inst Elect, Beijing 100036, Peoples R China.;[Hu, Xiaohua Tony] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Corresponding institution:
[Zhou, Guangyou] C;Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.
Author affiliations:
[He, Tingting; Zhou, Guangyou; Zhou, Yin] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.;[Wu, Wensheng] Univ So Calif, Dept Comp Sci, Los Angeles, CA 90089 USA.
Corresponding institution:
[Zhou, Guangyou] C;Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.
Keywords:
Community question answering;Question retrieval;Text mining;Yahoo! Answers
Abstract:
Learning the semantic representation using neural network architecture. The neural network is trained via pre-training and fine-tuning phases. The learned semantic level feature is incorporated into an LTR framework. In community question answering (cQA), users pose queries (or questions) on portals like Yahoo! Answers which can then be answered by other users who are often knowledgeable on the subject. cQA is increasingly popular on the Web, due to its convenience and effectiveness in connecting users with queries and those with answers. In this article, we study the problem of finding previous queries (e.g., posed by other users) which may be similar to new queries, and adapting their answers as the answers to the new queries. A key challenge here is to bridge the lexical gap between new queries and old answers. For example, "company" in the queries may correspond to "firm" in the answers. To address this challenge, past research has proposed techniques similar to machine translation that "translate" old answers to ones using the words in the new queries. However, a key limitation of these works is that they assume queries and answers are parallel texts, which is hardly true in reality. As a result, the translated or rephrased answers may not look intuitive. In this article, we propose a novel approach to learn the semantic representation of queries and answers by using a neural network architecture. The learned semantic level features are finally incorporated into a learning to rank framework. We have evaluated our approach using a large-scale data set. Results show that the approach can significantly outperform existing approaches.
Author affiliations:
[Jiang, Xingpeng; Yang, Jincai; He, Tingting; Shen, Xianjun; Hu, Xiaohua; Yi, Li; Zhao, Yanli] Cent China Normal Univ, Sch Comp, Wuhan, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Corresponding institution:
[Yang, Jincai] C;Cent China Normal Univ, Sch Comp, Wuhan, Peoples R China.
Keywords:
*Clustering coefficient;*Neighbor affinity;*Temporal protein complex;*Time course protein interaction networks
Abstract:
Detection of temporal protein complexes would greatly further our knowledge of the dynamic features and molecular mechanisms of cellular activities. Most existing clustering algorithms for discovering protein complexes are based on static protein interaction networks, in which the inherent dynamics are often overlooked. We propose a novel algorithm, DPC-NADPIN (Discovering Protein Complexes based on Neighbor Affinity and Dynamic Protein Interaction Network), to identify temporal protein complexes from time course protein interaction networks. Inspired by the idea that the more tightly a protein's neighbors inside a module are connected, the more likely the protein is to belong to the module, DPC-NADPIN first consolidates each protein with a high clustering coefficient and its neighbors into an initial cluster. The initial cluster then grows into a protein complex by appending neighbor proteins according to the relationship between the affinity among neighbors inside the cluster and that outside the cluster. In our experiments, DPC-NADPIN proves to be reasonable and performs better at discovering protein complexes than the following state-of-the-art algorithms: Hunter, MCODE, CFinder, SPICI, and ClusterONE. It also obtains many protein complexes with strong biological significance, which provide helpful biological knowledge to related researchers. Moreover, we find that proteins are assembled coordinately to form protein complexes with characteristics of temporality and spatiality, thereby performing specific biological functions.
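The seed-selection step relies on the local clustering coefficient, which measures how densely a protein's neighbors are connected to each other. The sketch below computes it and forms initial clusters from high-coefficient proteins; the toy network and the `threshold` value are assumptions, and the neighbor-affinity extension step of DPC-NADPIN is not reproduced.

```python
def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbors that are themselves connected."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
    return 2.0 * links / (k * (k - 1))

def initial_clusters(adj, threshold):
    """Each protein at or above the threshold seeds a cluster with its neighbors."""
    return {v: {v} | adj[v] for v in adj
            if clustering_coefficient(adj, v) >= threshold}

# Hypothetical network: triangle A-B-C plus D attached to A.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
seeds = initial_clusters(adj, threshold=0.9)
```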
Journal:
INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2016, 15(2): 125-144 ISSN:1748-5673
Corresponding author:
Guo, Xiyue
Author affiliations:
[Guo, Xiyue] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.;[Guo, Xiyue] Xingyi Normal Univ Nationalities, Sch Informat Technol, Xingyi, Peoples R China.;[He, Tingting] Cent China Normal Univ, Sch Comp, Nat Language Proc Lab, Wuhan, Peoples R China.;[Xing, Ying] Zhongyuan Univ Technol, Software Coll, Zhengzhou, Peoples R China.
Corresponding institution:
[Guo, Xiyue] C;[Guo, Xiyue] X;Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.;Xingyi Normal Univ Nationalities, Sch Informat Technol, Xingyi, Peoples R China.
Keywords:
PPI extraction;weakly supervised;word dictionary construction;rule learning
Abstract:
Each kind of method for extracting PPIs (protein-protein interactions) from the biomedical literature, machine learning-based or rule-based, has advantages and disadvantages. To exploit the strengths of both, this paper designs a new structure for the relational word dictionary, uses a weakly supervised method to find dictionary items and add them to the PPI relational word dictionary, and presents a method that learns PPI relational rules automatically based on the slot-filling principle. Moreover, the method takes PPI relation instances without apparent relational words into consideration, aiming to improve the final performance. We conduct experiments on five authoritative biomedical PPI corpora and discover some distribution features of PPI relational words. Finally, we compare our method with several recent approaches, and the results show that its performance is above the average level of these methods.
Abstract:
Secret sharing (SS) is one of the most important cryptographic primitives used for data outsourcing. The (t, n) SS was introduced by Shamir and Blakley separately in 1979. The secret sharing policy of the (t, n) threshold SS is far too simple for many applications because it assumes that every shareholder has equal privilege to the secret or that every shareholder is equally trusted. Ito et al. introduced the concept of a general secret sharing scheme (GSS). In a GSS, a secret is divided among a set of shareholders in such a way that any "qualified" subset of shareholders can access the secret, but any "unqualified" subset of shareholders cannot access the secret. The secret access structure of GSS is far more flexible than threshold SS. In this paper, we propose an optimized implementation of GSS. Our proposed scheme first uses Boolean logic to derive two important subsets of a given general secret sharing structure: Min, the minimal positive access subset, and Max, the maximal negative access subset. Then, conditions on the parameters of a GSS are established based on these two subsets. Furthermore, integer linear/non-linear programming is used to optimize the size of shares of a GSS. The complexity of linear/non-linear programming is O(n), where n is the number of shares generated by the dealer. This proposed design can be applied to implement GSS based on any classical SS. However, our proposed method is applicable only to some general secret sharing policies. We use two GSSs, one based on Shamir's weighted SS (WSS) using a linear polynomial and the other based on Asmuth-Bloom's SS using the Chinese Remainder Theorem (CRT), to demonstrate our design. Compared with existing GSSs, our proposed scheme is more efficient and can be applied to all classical SSs. (C) 2016 Elsevier Inc. All rights reserved.
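For readers unfamiliar with the (t, n) threshold scheme that the GSS generalizes, here is a minimal sketch of Shamir's secret sharing over a prime field. It shows only the classical threshold scheme, not the paper's optimized GSS; the prime and the function names are choices made for this illustration.

```python
import random

PRIME = 2**127 - 1  # a Mersenne prime, large enough for small integer secrets

def make_shares(secret, t, n, rng=None):
    """Split `secret` into n shares; any t of them recover it."""
    rng = rng or random.SystemRandom()
    # Random polynomial of degree t-1 whose constant term is the secret.
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(t - 1)]

    def evaluate(x):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        return y

    return [(x, evaluate(x)) for x in range(1, n + 1)]

def recover(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        # Modular inverse via Fermat's little theorem.
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

shares = make_shares(123456789, t=3, n=5)
```

Every shareholder here holds exactly one share of equal weight, which is the uniform-privilege assumption the abstract describes as too simple for many applications.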
Abstract:
The identification of modules in complex networks is important for the understanding of biological systems. Recent studies have shown that such modules can be identified from the protein interaction network and, moreover, that the modules have not only relatively high density but also a high coefficient of affinity. In this paper, we propose a novel algorithm based on Connected Affinity and Multi-level Seed Extension (CAMSE). First, CAMSE integrates Protein Interactions (PPI) with the protein Connected Coefficient (CC) inferred from protein complexes collected in the MIPS database to enhance the modularisation and biological character. Then we complete the seed selection, inner kernel extension and outer extension step by step to obtain the core candidate functional modules. Finally, we merge the modules with a high repeat rate. The experimental results show that CAMSE detects functional modules much more effectively and accurately than the state-of-the-art algorithms CPM, CACE and IPC-MCE.
Journal:
Lecture Notes in Computer Science, 2016, 9544: 127-140 ISSN:0302-9743
Corresponding author:
He, Tingting(tthe@mail.ccnu.edu.cn)
Author affiliations:
[Guo, Xiyue] National Engineering Research Center for E-learning, Central China Normal University, Wuhan, China;[Guo, Xiyue] School of Information Technology, Xingyi Normal University for Nationalities, Xingyi, China;[He, Tingting] School of Computer, Central China Normal University, Wuhan, China
Authors:
Jian, Fanghong;Huang, Jimmy Xiangji*;Zhao, Jiashu;He, Tingting(何婷婷);Hu, Po
Journal:
SIGIR'16: PROCEEDINGS OF THE 39TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2016: 733-736
Corresponding author:
Huang, Jimmy Xiangji
Author affiliations:
[Jian, Fanghong] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Hubei, Peoples R China.;[Huang, Jimmy Xiangji] Cent China Normal Univ, Informat Retrieval & Knowledge Management Res Lab, Wuhan, Hubei, Peoples R China.;[He, Tingting; Hu, Po] Cent China Normal Univ, Sch Comp Sci, Wuhan, Hubei, Peoples R China.;[Zhao, Jiashu] York Univ, Sch Informat Technol, Toronto, ON, Canada.
Corresponding institution:
[Huang, Jimmy Xiangji] C;Cent China Normal Univ, Informat Retrieval & Knowledge Management Res Lab, Wuhan, Hubei, Peoples R China.
Conference name:
39th International ACM SIGIR conference on Research and Development in Information Retrieval
Conference date:
JUL 17-21, 2016
Conference location:
Pisa, ITALY
Conference organizer:
[Huang, Jimmy Xiangji] Cent China Normal Univ, Informat Retrieval & Knowledge Management Res Lab, Wuhan, Hubei, Peoples R China.^[Jian, Fanghong] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Hubei, Peoples R China.^[He, Tingting;Hu, Po] Cent China Normal Univ, Sch Comp Sci, Wuhan, Hubei, Peoples R China.^[Zhao, Jiashu] York Univ, Sch Informat Technol, Toronto, ON, Canada.
Keywords:
Probabilistic Model;Dirichlet Language Model;LDA
Abstract:
Traditional information retrieval (IR) models, in which a document is normally represented as a bag of words and their frequencies, capture the term-level and document-level information. Topic models, on the other hand, discover semantic topic-based information among words. In this paper, we consider term-based information and semantic information as two features of query terms and propose a simple enhancement for ad-hoc IR via topic modeling. In particular, three topic-based hybrid models, LDA-BM25, LDA-MATF and LDA-LM, are proposed. A series of experiments on eight standard datasets show that our proposed models can always outperform significantly the corresponding strong baselines over all datasets in terms of MAP and most of datasets in terms of P@5 and P@20. A direct comparison on eight standard datasets also indicates our proposed models are at least comparable to the state-of-the-art approaches.
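The idea of treating term-based and topic-based evidence as two features of a query term can be sketched as follows. The BM25 weight uses the standard formula with a non-negative IDF variant; the topic feature is the usual LDA document-model probability. The linear interpolation in `hybrid_score` and the value of `lam` are purely hypothetical, since the abstract does not state how LDA-BM25 combines the two features.

```python
import math

def bm25_term(tf, df, doc_len, avg_len, n_docs, k1=1.2, b=0.75):
    """BM25 weight of one query term in one document."""
    idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
    length_norm = k1 * (1.0 - b + b * doc_len / avg_len)
    return idf * tf * (k1 + 1.0) / (tf + length_norm)

def topic_term(p_w_given_z, p_z_given_d):
    """P(w|d) under a topic model: sum over topics of P(w|z) * P(z|d)."""
    return sum(pw * pz for pw, pz in zip(p_w_given_z, p_z_given_d))

def hybrid_score(bm25, topic, lam=0.8):
    """Hypothetical combination: linear interpolation of the two features."""
    return lam * bm25 + (1.0 - lam) * topic

# Made-up statistics for one term in one document.
term_weight = bm25_term(tf=3, df=50, doc_len=120, avg_len=100, n_docs=1000)
topic_weight = topic_term([0.1, 0.3], [0.5, 0.5])
score = hybrid_score(term_weight, topic_weight)
```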
Author affiliations:
[Jiang, Xingpeng; Yang, Jincai; He, Tingting; Shen, Xianjun; Hu, Xiaohua; Yi, Li] Cent China Normal Univ, Sch Comp, Wuhan, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Corresponding institution:
[Yang, Jincai] C;Cent China Normal Univ, Sch Comp, Wuhan, Peoples R China.
Keywords:
Protein complexes;Protein interactions;Protein interaction networks;Gene expression;Algorithms;Molecular evolution;Protein expression;Protein metabolism
Abstract:
The identification of temporal protein complexes would contribute greatly to our knowledge of the dynamic organization characteristics of protein interaction networks (PINs). Recent studies have focused on integrating gene expression data into a static PIN to construct a dynamic PIN that reveals the dynamic evolution of protein interactions, but in practice they fail to recognize the active time points of proteins with low or high expression levels. We construct a Time-Evolving PIN (TEPIN) with a novel method called Deviation Degree, which is designed to identify the active time points of proteins based on the deviation degree of their own expression values. Moreover, because protein interactions differ from one another, we weight the TEPIN with connected affinity and gene co-expression to quantify the degree of these interactions. To validate the effectiveness of our methods, the ClusterONE, CAMSE and MCL algorithms are applied to the TEPIN, DPIN (a dynamic PIN constructed with the state-of-the-art three-sigma method) and SPIN (the original static PIN) to detect temporal protein complexes. Each algorithm performs better on our TEPIN than on the other networks in terms of match degree, sensitivity, specificity, F-measure and function enrichment. In conclusion, our Deviation Degree method successfully eliminates the disadvantages of previous state-of-the-art dynamic PIN construction methods. Moreover, the biological nature of protein interactions is well described by our weighted network. The weighted TEPIN is a useful approach for detecting temporal protein complexes and revealing the dynamic protein assembly process for cellular organization.