Journal:
BRIEFINGS IN BIOINFORMATICS, 2024, 25(3) ISSN:1467-5463
Corresponding author(s):
Xingpeng Jiang; Cuihong Wan
Author affiliations:
[Wan, Cuihong; Peng, Zhao] School of Life Sciences, and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan 430079, Hubei, People's Republic of China;[Li, Jiaqiang; Jiang, Xingpeng] School of Computer Science, and Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, Hubei, People's Republic of China
Corresponding affiliation(s):
[Xingpeng Jiang; Cuihong Wan] School of Computer Science, and Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, Hubei, People's Republic of China; School of Life Sciences, and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan 430079, Hubei, People's Republic of China
Abstract:
Small open reading frames (smORFs) have been acknowledged to play various roles on essential biological pathways and affect human beings from diabetes to tumorigenesis. Predicting smORFs in silico is quite a prerequisite for processing the omics data. Here, we proposed the smORF-coding-potential-predicting framework, sOCP, which provides functions to construct a model for predicting novel smORFs in some species. The sOCP model constructed in human was based on in-frame features and the nucleotide bias around the start codon, and the small feature subset was proved to be competent enough and avoid overfitting problems for complicated models. It showed more advanced prediction metrics than previous methods and could correlate closely with experimental evidence in a heterogeneous dataset. The model was applied to Rattus norvegicus and exhibited satisfactory performance. We then scanned smORFs with ATG and non-ATG start codons from the human genome and generated a database containing about a million novel smORFs with coding potential. Around 72000 smORFs are located on the lncRNA regions of the genome. The smORF-encoded peptides may be involved in biological pathways rare for canonical proteins, including glucocorticoid catabolic process and the prokaryotic defense system. Our work provides a model and database for human smORF investigation and a convenient tool for further smORF prediction in other species.
Journal:
IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2024, 21(1):120-128 ISSN:1545-5963
Corresponding author(s):
Shen, XJ
Author affiliations:
[Shen, Xianjun; Xiao, Zhen; Zhao, Weizhong; Shen, XJ; Jiang, Xingpeng; Sun, Han; Li, Dandan] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.
Corresponding affiliation(s):
[Shen, XJ] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.
Keywords:
Drugs;Diseases;Proteins;Heterogeneous networks;Kernel;Semantics;Matrix decomposition;Drug repositioning;drug-disease association prediction;heterogeneous networks;graph attention model;multi-kernel deep learning
Abstract:
Computational drug repositioning can identify potential associations between drugs and diseases. This technology has been shown to be effective in accelerating drug development and reducing experimental costs. Although there has been plenty of research for this task, existing methods are deficient in utilizing complex relationships among biological entities, which may not be conducive to subsequent simulation of drug treatment processes. In this article, we propose a heterogeneous graph embedding method called HMLKGAT to infer novel potential drugs for diseases. More specifically, we first construct a heterogeneous information network by combining drug-disease, drug-protein and disease-protein biological networks. Then, a multi-layer graph attention model is utilized to capture the complex associations in the network to derive representations for drugs and diseases. Finally, to maintain the relationship of nodes in different feature spaces, we propose a multi-kernel learning method to transform and combine the representations. Experimental results demonstrate that HMLKGAT outperforms six state-of-the-art methods in drug-related disease prediction, and case studies of five classical drugs further demonstrate the effectiveness of HMLKGAT.
Author affiliations:
[Zhao, Weizhong; He, Tingting; Jiang, Xingpeng; Wu, Junze] Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, Hubei 430079, P.R. China;[Zhao, Weizhong] School of Computer, Central China Normal University, Wuhan, Hubei 430079, P.R. China;[Zhao, Weizhong] National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan, Hubei 430079, P.R. China;[Hu, Xiaohua] College of Computing & Informatics, Drexel University, Philadelphia, PA 19104, United States
Corresponding affiliation(s):
[Weizhong Zhao; Xingpeng Jiang] Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, Hubei 430079, P.R. China; School of Computer, Central China Normal University, Wuhan, Hubei 430079, P.R. China; National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan, Hubei 430079, P.R. China
Abstract:
MOTIVATION: The crisis of antibiotic resistance, which causes antibiotics used to treat bacterial infections to become less effective, has emerged as one of the foremost challenges to public health. Identifying the properties of antibiotic resistance genes (ARGs) is an essential way to mitigate this issue. Although numerous methods have been proposed for this task, most of these approaches concentrate solely on predicting antibiotic class, disregarding other important properties of ARGs. In addition, existing methods for simultaneously predicting multiple properties of ARGs fail to account for the causal relationships among these properties, limiting the predictive performance. RESULTS: In this study, we propose a causality-guided framework for annotating properties of ARGs, in which causal inference is utilized for representation learning. More specifically, the hidden biological patterns determining the properties of ARGs are described by a Gaussian Mixture Model, and procedure of causal representation learning is used to derive the hidden features. In addition, a causal graph among different properties is constructed to capture the causal relationships among properties of ARGs, which is integrated into the task of annotating properties of ARGs. The experimental results on a real-world dataset demonstrate the effectiveness of the proposed framework on the task of annotating properties of ARGs. AVAILABILITY AND IMPLEMENTATION: The data and source codes are available in GitHub at https://github.com/David-WZhao/CausalARG.
Journal:
IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2023, PP:1-12 ISSN:2168-2194
Author affiliations:
[Xueli Pan; Frank van Harmelen] Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands;Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, China;National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, China;National Language Resources Monitor Research Center for Network Media, Central China Normal University, Wuhan, China;School of Computer Science, Central China Normal University, Wuhan, China
Abstract:
It is commonly known that food nutrition is closely related to human health. The complex interactions between food nutrients and diseases, influenced by gut microbial metabolism, present challenges in systematizing and practically applying knowledge. To address this, we propose a method for extracting triples from a vast amount of literature, which is used to construct a comprehensive knowledge graph on nutrition and human health. Concurrently, we develop a query-based question answering system over our knowledge graph, proficiently addressing three types of questions. The results show that our proposed model outperforms other state-of-the-art methods, achieving a precision of 0.92, a recall of 0.81, and an F1 score of 0.86 in the nutrition and disease relation extraction task. Meanwhile, our question answering system achieves an accuracy of 0.68 and an F1 score of 0.61 on our benchmark dataset, showcasing competitiveness in practical scenarios. Furthermore, we design five independent experiments to assess the quality of the data structure in the knowledge graph, ensuring results characterized by high accuracy and interpretability. In conclusion, the construction of our knowledge graph shows significant promise in facilitating diet recommendations, enhancing patient care applications, and informing decision-making in clinical research.
Journal:
IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2023, 27(6):3061-3071 ISSN:2168-2194
Corresponding author(s):
Zhao, Weizhong; Shen, XJ
Author affiliations:
[Shen, Xianjun; Wang, Haodong; Wang, Yue; Zhao, Weizhong; Zhao, WZ; Shen, XJ; Jiang, Xingpeng; Li, Dandan] Cent China Normal Univ, Sch Comp, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China.;[Sun, Han] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Peoples R China.;[Shen, Xianjun; Wang, Haodong; Wang, Yue; Zhao, Weizhong; Zhao, WZ; Shen, XJ; Jiang, Xingpeng; Li, Dandan] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netwo, Wuhan 430079, Peoples R China.
Corresponding affiliation(s):
[Zhao, WZ; Shen, XJ] Cent China Normal Univ, Sch Comp, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netwo, Wuhan 430079, Peoples R China.
Keywords:
graph representation learning;heterogeneous information network;multi-head attention mechanism;Phage-host interactions prediction
Abstract:
In the treatment of bacterial infectious diseases, overuse of antibiotics may lead to not only bacterial resistance to antibiotics but also dysbiosis of beneficial bacteria which are essential for maintaining normal human life activities. Instead, phage therapy, which invades and lyses specific pathogenic bacteria without affecting beneficial bacteria, becomes more and more popular to treat bacterial infectious diseases. For the effective phage therapy, it requires to accurately predict potential phage-host interactions from heterogeneous information network consisting of bacteria and phages. Although many models have been proposed for predicting phage-host interactions, most methods fail to consider fully the sparsity and unconnectedness of phage-host heterogeneous information network, deriving the undesirable performance on phage-host interactions prediction. To address the challenge, we propose an effective model called GERMAN-PHI for predicting Phage-Host Interactions via Graph Embedding Representation learning with Multi-head Attention mechaNism. In GERMAN-PHI, the multi-head attention mechanism is utilized to learn representations of phages and hosts from multiple perspectives of phage-host associations, addressing the sparsity and unconnectedness in phage-host heterogeneous information network. More specifically, a module of GAT with talking-heads is employed to learn representations of phages and bacteria, on which neural induction matrix completion is conducted to reconstruct the phage-host association matrix. Results of comprehensive experiments demonstrate that GERMAN-PHI performs better than the state-of-the-art methods on phage-host interactions prediction. In addition, results of case study for two high-risk human pathogens show that GERMAN-PHI can predict validated phages with high accuracy, and some potential or new associated phages are provided as well.
Journal:
BRIEFINGS IN BIOINFORMATICS, 2023, 24(2) ISSN:1467-5463
Corresponding author(s):
Weizhong Zhao
Author affiliations:
[Shen, Xianjun; Zhao, Weizhong; Yuan, Xueling; He, Tingting; Jiang, Xingpeng] Cent China Normal Univ, Sch Comp, Wuhan, Peoples R China.;[Shi, Chuan] Beijing Univ Posts & Telecommun, Sch Comp Sci, Beijing, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Corresponding affiliation(s):
[Weizhong Zhao] Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, Hubei 430079, P.R. China; School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, 100876, P.R. China; National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan, Hubei 430079, P.R. China
Keywords:
drug–drug interaction;heterogeneous information network;meta-path-based information fusion
Abstract:
Drug-drug interactions (DDIs) are compound effects when patients take two or more drugs at the same time, which may weaken the efficacy of drugs or cause unexpected side effects. Thus, accurately predicting DDIs is of great significance for drug development and drug safety surveillance. Although many methods have been proposed for the task, the biological knowledge related to DDIs is not fully utilized and the complex semantics among drug-related biological entities are not effectively captured in existing methods, leading to suboptimal performance. Moreover, the lack of interpretability for the predicted results also limits the wide application of existing methods for DDIs prediction. In this study, we propose a novel framework for predicting DDIs with interpretability. Specifically, we construct a heterogeneous information network (HIN) by explicitly utilizing the biological knowledge related to the procedure of inducing DDIs. To capture the complex semantics in HIN, a meta-path-based information fusion mechanism is proposed to learn high-quality representations of drugs. In addition, an attention mechanism is designed to combine semantic information obtained from meta-paths with different lengths to obtain final representations of drugs for DDIs prediction. Comprehensive experiments are conducted on 2410 approved drugs, and the results of predictive performance comparison show that our proposed framework outperforms selected representative baselines on the task of DDIs prediction. The results of the ablation study and cold-start scenario indicate that the meta-path-based information fusion mechanism is beneficial for capturing the complex semantics among drug-related biological entities. Moreover, the results of the case study demonstrate that the designed attention mechanism is able to provide partial interpretability for the predicted DDIs. Therefore, the proposed method will be a feasible solution to the task of predicting DDIs.
Journal:
Artificial Intelligence in Medicine, 2023, 145:102677 ISSN:0933-3657
Corresponding author(s):
Jiang, XP
Author affiliations:
[Fu, Chengcheng] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.;[Jiang, Xingpeng; Fu, Chengcheng; He, Tingting] Cent China Normal Univ, Sch Comp Sci, Wuhan, Peoples R China.;[Fu, Chengcheng; van Harmelen, Frank; Huang, Zhisheng] Vrije Univ Amsterdam, Dept Comp Sci, Amsterdam, Netherlands.;[Fu, Chengcheng; He, Tingting; Jiang, Xingpeng] Cent China Normal Univ, Natl Language Resources Monitor Res Ctr Network Me, Wuhan, Peoples R China.;[Huang, Zhisheng] Tongji Univ, Sch Med, Clin Res Ctr Mental Disorders, Shanghai Pudong New Area Mental Hlth Ctr, Shanghai, Peoples R China.
Corresponding affiliation(s):
[Jiang, XP] Cent China Normal Univ, Sch Comp Sci, Wuhan, Peoples R China.
Keywords:
Food;Gut microbiota;Knowledge graph;Mental health
Journal:
BRIEFINGS IN BIOINFORMATICS, 2023, 24(2) ISSN:1467-5463
Corresponding author(s):
Xingpeng Jiang
Author affiliations:
[Wang, Haodong; Wang, Yue; Xiao, Zhen; Huang, Xiaoyun; He, Tingting; Jiang, Xingpeng; Sun, Han] Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China;[Wang, Haodong; Wang, Yue; Xiao, Zhen; Huang, Xiaoyun; He, Tingting; Jiang, Xingpeng; Sun, Han] School of Computer Science, Central China Normal University, Wuhan 430079, China;[Xiao, Zhen; Sun, Han] School of Mathematics and Statistics, Central China Normal University, Wuhan 430079, China;[Huang, Xiaoyun] Collaborative & Innovative Center for Educational Technology, Central China Normal University, Wuhan 430079, China;[He, Tingting; Jiang, Xingpeng] National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan 430079, China
Corresponding affiliation(s):
[Xingpeng Jiang] Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China; School of Computer Science, Central China Normal University, Wuhan 430079, China; National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan 430079, China
Keywords:
Kernel machine regression;Microbiome-based association test;Multinomial logit model;Ordinal/Nominal multicategory phenotypes
Abstract:
Microbes can affect the metabolism and immunity of human body incessantly, and the dysbiosis of human microbiome drives not only the occurrence but also the progression of disease (i.e. multiple statuses of disease). Recently, microbiome-based association tests have been widely developed to detect the association between the microbiome and host phenotype. However, the existing methods have not achieved satisfactory performance in testing the association between the microbiome and ordinal/nominal multicategory phenotypes (e.g. disease severity and tumor subtype). In this paper, we propose an optimal microbiome-based association test for multicategory phenotypes, namely, multiMiAT. Specifically, under the multinomial logit model framework, we first introduce a microbiome regression-based kernel association test for multicategory phenotypes (multiMiRKAT). As a data-driven optimal test, multiMiAT then integrates multiMiRKAT, score test and MiRKAT-MC to maintain excellent performance in diverse association patterns. Massive simulation experiments prove the success of our method. Furthermore, multiMiAT is also applied to real microbiome data experiments to detect the association between the gut microbiome and clinical statuses of colorectal cancer as well as for diverse statuses of Clostridium difficile infections.
Journal:
IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2023, 20(6):3635-3647 ISSN:1545-5963
Author affiliations:
[Xiaohua Hu] College of Computing & Informatics, Drexel University, Philadelphia, PA, USA;[Chuan Shi] School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, China;[Wenjie Yao; Xingpeng Jiang; Tingting He] School of Computer, Central China Normal University, Wuhan, Hubei, China;[Weizhong Zhao] Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, School of Computer, and National Language Resources Monitoring and Research Center for Network Media, Central China Normal University, Wuhan, Hubei, China
Abstract:
Side effects of drugs have gained increasing attention in the biomedical field, and accurate identification of drug side effects is essential for drug development and drug safety surveillance. Although the traditional pharmacological experiments can accurately detect the side effects of drugs, the identifying process is time-consuming, costly, and may lead to incomplete identification of side effects. With the expanding of various biomedical databases, many computational methods have been developed for the task of drug-side effect associations (DSAs) prediction. However, existing methods have the following three drawbacks: 1). multiple drug-related databases are not fully used; 2). the complex semantics among drugs and side effects are not effectively captured; 3). the explainability of the predicted DSAs is missed for most existing methods. Therefore, there is an urgent need to find a more effective method for predicting DSAs. To address these issues, we propose a novel meta-path-based graph neural network model for drug-side effect associations prediction (MPGNN-DSA). In MPGNN-DSA, a heterogeneous information network is first constructed by combining multiple biological datasets. Then, a meta-path-based feature learning module is utilized for learning high-quality representations of drugs and side effects by capturing the semantics contained in meta-paths of the constructed HIN. With the learned features, the prediction module is conducted to derive the predicted side effects for drugs. In addition, the explainability of the predicted DSAs can be provided as well with the semantics contained in meta-paths. We conduct comprehensive experiments, and the results demonstrate the effectiveness of MPGNN-DSA, suggesting that the proposed method will be a feasible solution to the task of DSAs prediction. 
Authors:
Ye, Shengwei;Zhao, Weizhong;Shen, Xianjun;Jiang, Xingpeng;He, Tingting
Journal:
Methods, 2023, 218:48-56 ISSN:1046-2023
Corresponding author(s):
Zhao, WZ
Author affiliations:
[Shen, Xianjun; Zhao, Weizhong; Zhao, WZ; He, Tingting; Jiang, Xingpeng; Ye, Shengwei] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Hubei, Peoples R China.;[Shen, Xianjun; Zhao, Weizhong; He, Tingting; Jiang, Xingpeng; Ye, Shengwei] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[Shen, Xianjun; Zhao, Weizhong; He, Tingting; Jiang, Xingpeng; Ye, Shengwei] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netwo, Wuhan 430079, Hubei, Peoples R China.
Corresponding affiliation(s):
[Zhao, WZ] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Hubei, Peoples R China.
Keywords:
Drug repurposing;Drug-disease associations prediction;Graph convolutional network;Heterogeneous information network;Multi-task learning
Abstract:
Drug repurposing, which typically applies the procedure of drug-disease associations (DDAs) prediction, is a feasible solution to drug discovery. Compared with traditional methods, drug repurposing can reduce the cost and time for drug development and advance the success rate of drug discovery. Although many methods for drug repurposing have been proposed and the obtained results are relatively acceptable, there is still some room for improving the predictive performance, since those methods fail to consider fully the issue of sparseness in known drug-disease associations. In this paper, we propose a novel multi-task learning framework based on graph representation learning to identify DDAs for drug repurposing. In our proposed framework, a heterogeneous information network is first constructed by combining multiple biological datasets. Then, a module consisting of multiple layers of graph convolutional networks is utilized to learn low-dimensional representations of nodes in the constructed heterogeneous information network. Finally, two types of auxiliary tasks are designed to help to train the target task of DDAs prediction in the multi-task learning framework. Comprehensive experiments are conducted on real data and the results demonstrate the effectiveness of the proposed method for drug repurposing.
Author affiliations:
[Wan, Cuihong; Peng, Zhao] Cent China Normal Univ, Sch Life Sci, Wuhan 430079, Hubei, Peoples R China.;[Wan, Cuihong; Peng, Zhao] Cent China Normal Univ, Hubei Key Lab Genet Regulat & Integrat Biol, Wuhan 430079, Hubei, Peoples R China.;[Li, Jiaqiang; Jiang, Xingpeng] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[Li, Jiaqiang; Jiang, Xingpeng] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Hubei, Peoples R China.
Corresponding affiliation(s):
[Xingpeng Jiang; Cuihong Wan] School of Computer, and Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, Hubei, People's Republic of China; School of Life Sciences, and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan 430079, Hubei, People's Republic of China
Abstract:
As one of the essential life forms in the biosphere, research on cyanobacteria has been growing remarkably for decades. Biological functions in organisms are often accomplished through protein-protein interactions (PPIs), which help to regulate interacting proteins or organize them into an integral machine. However, the study of PPIs in cyanobacteria falls far behind that in mammals and has not been integrated for ease of use. Thus, we built CyanoMapDB (http://www.cyanomapdb.msbio.pro/), a database providing cyanobacterial PPIs with experimental evidence, consisting of 52,304 PPIs among 6,789 proteins from 23 cyanobacterial species. We collected available data in UniProt, STRING, and IntAct, and mined numerous PPIs from co-fractionation MS data in cyanobacteria. The integrated data are accessible in CyanoMapDB (http://www.cyanomapdb.msbio.pro/), enabling users to easily query proteins of interest, investigate interacting proteins with evidence from different sources, and acquire a visual network of the target protein. We believe that CyanoMapDB will promote research involved with cyanobacteria and plants.
Journal:
Information Processing & Management, 2023, 60(1):103114 ISSN:0306-4573
Corresponding author(s):
Weizhong Zhao
Author affiliations:
[Zhao, Weizhong; Xia, Jun; He, Tingting; Jiang, Xingpeng] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Hubei, Peoples R China.;[Zhao, Weizhong] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[Zhao, Weizhong] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netwo, Wuhan 430079, Hubei, Peoples R China.
Corresponding affiliation(s):
[Weizhong Zhao] Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, Hubei, China; School of Computer, Central China Normal University, Wuhan 430079, Hubei, China; National Language Resources Monitoring and Research Center for Network Media, Central China Normal University, Wuhan 430079, Hubei, China
Keywords:
Deep knowledge tracing;Forgetting and learning mechanisms;Intelligent education
Journal:
IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2022, 19(1):513-521 ISSN:1545-5963
Corresponding author(s):
Zhang, XF
Author affiliations:
[Tan, Yu-Ting; Zhang, Xiao-Fei] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Hubei, Peoples R China.;[Tan, Yu-Ting; Zhang, Xiao-Fei] Cent China Normal Univ, Hubei Key Lab Math Sci, Wuhan 430079, Hubei, Peoples R China.;[Ou-Yang, Le] Shenzhen Univ, Coll Informat Engn, Shenzhen 518060, Guangdong, Peoples R China.;[Ou-Yang, Le] Shenzhen Univ, Shenzhen Key Lab Media Secur, Shenzhen 518060, Guangdong, Peoples R China.;[Jiang, Xingpeng] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.
Corresponding affiliation(s):
[Zhang, XF] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Hubei, Peoples R China.;Cent China Normal Univ, Hubei Key Lab Math Sci, Wuhan 430079, Hubei, Peoples R China.
Abstract:
It is an important task to learn how gene regulatory networks change under different conditions. Several Gaussian graphical model-based methods have been proposed to deal with this task by inferring differential networks from gene expression data. However, most existing methods define the differential networks as the difference of precision matrices, which may include false differential edges caused by the change of conditional variances. In addition, prior information about the condition-specific networks and the differential networks can be obtained from other domains. It is useful to incorporate prior information into differential network analysis. In this study, we propose a new differential network analysis method to address the above challenges. Instead of using the precision matrices, we define the differential networks as the difference of partial correlations, which can exclude the spurious differential edges due to the variants of conditional variances. Furthermore, prior information from multiple hypothesis testing is incorporated using a weighted fused penalty. Simulation studies show that our method outperforms the competing methods. We also apply our method to identify the differential network between luminal A and basal-like subtypes of breast cancers and the differential network between acute myeloid leukemia tumors and normal samples. The hub genes in the differential networks identified by our method carry out important biological functions.
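The distinction drawn in this abstract between differencing precision matrices and differencing partial correlations can be illustrated with a small worked example. The sketch below is not the authors' method (it includes no penalty or estimation step); it only applies the standard identity rho_ij = -omega_ij / sqrt(omega_ii * omega_jj) to two hypothetical 2x2 precision matrices, chosen so that only a conditional variance changes between conditions. All names and matrix values are assumptions made for illustration.

```python
import math


def partial_correlations(precision):
    """Convert a precision matrix into a partial-correlation matrix.

    Uses the standard identity rho_ij = -omega_ij / sqrt(omega_ii * omega_jj).
    """
    size = len(precision)
    rho = [[1.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if i != j:
                scale = math.sqrt(precision[i][i] * precision[j][j])
                rho[i][j] = -precision[i][j] / scale
    return rho


def matrix_difference(a, b):
    """Element-wise difference a - b of two equally sized matrices."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Hypothetical precision matrices for two conditions. omega_b rescales the
# first variable of omega_a, so the conditional variance changes while the
# conditional dependence between the two variables does not.
omega_a = [[1.0, -0.5], [-0.5, 1.0]]
omega_b = [[4.0, -1.0], [-1.0, 1.0]]

precision_diff = matrix_difference(omega_b, omega_a)
partial_diff = matrix_difference(
    partial_correlations(omega_b), partial_correlations(omega_a)
)

print("precision difference (off-diagonal):", precision_diff[0][1])
print("partial-correlation difference (off-diagonal):", partial_diff[0][1])
```

In this toy case the off-diagonal entry of the precision difference is non-zero even though the partial correlation is identical in both conditions, which is the kind of spurious differential edge the abstract describes.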
Author affiliations:
[Zhong, Duo; Jiang, Xingpeng; Li, Bojing] Cent China Normal Univ, Hubei Key Lab Artificial Intelligence & Smart Lear, Wuhan, Peoples R China.;[Zhong, Duo; Jiang, Xingpeng; Li, Bojing] Cent China Normal Univ, Sch Comp, Wuhan, Peoples R China.;[Qiao, Jimei] Shanghai Normal Univ, Math & Sci Coll, Shanghai, Peoples R China.;[Jiang, Xingpeng] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netwo, Wuhan, Peoples R China.
Corresponding affiliation(s):
[Xingpeng Jiang] Hubei Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, China; School of Computer, Central China Normal University, Wuhan, China; National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan, China
Abstract:
Microorganisms play important roles in our lives, especially in metabolism and disease. Determining the probability that a person suffers from a specific disease, and the severity of that disease, based on microbial genes is crucial for understanding the relationship between microbes and diseases. Previous studies extracted the topological information of phylogenetic trees and integrated it into metagenomic datasets, enabling classifiers to learn more information from limited datasets and thus improving model performance. In this paper, we propose the GNPI model to better learn the structure of phylogenetic trees. GNPI maintains the original vector format of metagenomic datasets, whereas previous approaches had to change the input form to matrices. The vector-like form of the input data can be easily adopted in baseline machine learning models and is also usable by deep learning models. Datasets processed with GNPI help enhance the accuracy of machine learning and deep learning models on three different datasets. GNPI is an interpretable data processing method for host phenotype prediction and other bioinformatics tasks.
摘要:
The increasing prevalence of antibiotic resistance has become a global health crisis. For the purpose of safety regulation, it is highly important to identify antibiotic resistance genes (ARGs) in bacteria. Although culture-based methods can identify ARGs more accurately, the identification process is time-consuming and requires specialized knowledge. With the rapid development of whole genome sequencing technology, researchers have attempted to identify ARGs by computing sequence similarity against public databases. However, these computational methods may fail to detect ARGs that have low sequence identity to known ARGs. Moreover, existing methods cannot effectively address multidrug resistance prediction for ARGs, which poses a great challenge to clinical treatment. To address these challenges, we propose an end-to-end multi-label learning framework for predicting ARGs. More specifically, ARG prediction is modeled as a multi-label learning problem, and a deep neural network-based end-to-end framework is proposed, in which a specific loss function is introduced to exploit the advantages of multi-label learning for ARG prediction. In addition, a dual-view modeling mechanism is employed to make full use of the semantic associations between two views of ARGs, i.e. sequence-based information and structure-based information. Extensive experiments are conducted on publicly available data, and the experimental results demonstrate the effectiveness of the proposed framework for ARG prediction.
期刊:
IEEE/ACM Transactions on Computational Biology and Bioinformatics,2022年19(3):1322-1333 ISSN:1545-5963
通讯作者:
Jiang, X.
作者机构:
[He, Tingting; Jiang, Xingpeng; Ma, Yingjun] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[Ma, Yingjun] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Hubei, Peoples R China.;[He, Tingting; Jiang, Xingpeng] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Hubei, Peoples R China.;[Tan, Yuting] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Peoples R China.;[Tan, Yuting] Hubei Key Lab Math Sci, Wuhan 430079, Hubei, Peoples R China.
通讯机构:
Central China Normal University, School of Computer, Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Hubei, Wuhan, China
会议名称:
18th Asia Pacific Bioinformatics Conference (APBC)
会议时间:
AUG 18-20, 2020
会议地点:
ELECTR NETWORK
摘要:
Infectious diseases are currently the most important and widespread health problem, and identifying viral infection mechanisms is critical for controlling diseases caused by highly infectious viruses. Because non-interacting protein pairs are lacking and the ratio of positive to negative samples is seriously imbalanced, supervised learning algorithms are not well suited to this prediction task. At the same time, owing to the lack of information on viral proteins and their significant sequence dissimilarity, some ensemble learning models have poor generalization ability. In this paper, we propose a Sequence-Based Ensemble Learning (Seq-BEL) method to predict potential virus-human PPIs. Specifically, based on the amino acid sequences of proteins and the currently known virus-human PPI network, Seq-BEL calculates various features and similarities of human proteins and viral proteins, and then combines these similarities and features to score potential virus-human PPIs. The computational results show that Seq-BEL succeeds in predicting potential virus-human PPIs and outperforms other state-of-the-art methods. More importantly, Seq-BEL also has good predictive performance for new human proteins and new viral proteins. In addition, the model is robust and generalizes well, and can be used as an effective tool for virus-human PPI prediction.
摘要:
Over the past decades, Chemical-induced Disease (CID) relations have attracted extensive attention in the biomedical community, reflecting their wide applications in biomedical research and healthcare. However, prior efforts fail to make full use of the interaction between local and global contexts in biomedical documents, and their performance leaves room for improvement. In this paper, we propose a novel framework for document-level CID relation extraction. More specifically, stacked Hypergraph Aggregation Neural Network (HANN) layers are introduced to model the complicated interaction between local and global contexts, based on which better contextualized representations are obtained for CID relation extraction. In addition, a CID Relation Heterogeneous Graph is constructed to capture information at different granularities and further improve the performance of CID relation classification. Experiments on a real-world dataset demonstrate the effectiveness of the proposed framework.
期刊:
BRIEFINGS IN BIOINFORMATICS,2022年23(5) ISSN:1467-5463
通讯作者:
Jiang, XP
作者机构:
[Tan, Yuting; Sun, Han] Cent China Normal Univ, Sch Math & Stat, Wuhan, Peoples R China.;[Jiang, Xingpeng; Huo, Ban; He, Tingting] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.;[Huang, Xiaoyun] Cent China Normal Univ, Collaborat & Innovat Ctr Educ Technol, Wuhan, Peoples R China.;[Jiang, Xingpeng] Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan, Peoples R China.
通讯机构:
[Jiang, XP] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.
关键词:
microbiome-based association test;longitudinal microbiome data;generalized estimating equations;sparse microbial association signals;higher criticism
摘要:
The association between the compositions of microbial communities and various host phenotypes is an important research topic. Microbiome association research addresses multiple domains, such as human disease and diet. Statistical methods for testing microbiome–phenotype associations have been studied recently to determine their ability to assess longitudinal microbiome data. However, existing methods fail to detect sparse association signals in longitudinal microbiome data. In this paper, we developed a novel method, namely aGEEMiHC, which is a data-driven adaptive microbiome higher criticism analysis based on generalized estimating equations to detect sparse microbial association signals from longitudinal microbiome data. aGEEMiHC adopts a generalized estimating equations framework that fully considers the correlation among different observations from the same subject in longitudinal data. To be robust to diverse correlation structures in longitudinal data, aGEEMiHC integrates multiple microbiome higher criticism analyses based on generalized estimating equations with different working correlation structures. Extensive simulation experiments demonstrate that aGEEMiHC controls the type I error correctly and achieves superior statistical power. We also applied it to longitudinal microbiome data with various types of host phenotypes to demonstrate the stability of our method. aGEEMiHC was also applied to real longitudinal microbiome data, where we found a significant association between the gut microbiome and Crohn’s disease. In addition, our method ranks the significant factors associated with the host phenotype to provide potential biomarkers.
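For readers unfamiliar with higher criticism, the idea of detecting sparse signals among many tests can be sketched with the classical Donoho-Jin statistic computed from sorted p-values. This is only the textbook statistic, not the aGEEMiHC procedure: it involves no generalized estimating equations, no working correlation structures, and no adaptive combination. The function name, the `alpha0` truncation parameter, and the toy p-values are assumptions for illustration.

```python
import math


def higher_criticism(p_values, alpha0=0.5):
    """Classical higher criticism statistic over the smallest alpha0 fraction of p-values."""
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one p-value")
    eps = 1e-12  # keep p strictly inside (0, 1) to avoid division by zero
    ordered = sorted(min(max(p, eps), 1.0 - eps) for p in p_values)
    k = max(1, int(alpha0 * n))
    best = -math.inf
    for i, p in enumerate(ordered[:k], start=1):
        # Standardized gap between the empirical fraction i/n and the p-value.
        z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, z)
    return best


n = 100
# Evenly spaced p-values mimic the absence of any association signal.
null_like = [(i - 0.5) / n for i in range(1, n + 1)]
# Same set, but three tests carry a strong signal (a sparse alternative).
sparse = [1e-6, 1e-6, 1e-6] + null_like[3:]

hc_null = higher_criticism(null_like)
hc_sparse = higher_criticism(sparse)
print("HC without signal:", round(hc_null, 3))
print("HC with 3 sparse signals:", round(hc_sparse, 3))
```

The statistic grows sharply when only a few p-values are very small, which is the situation in which tests that average evidence across all features tend to lose power.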
作者机构:
[Ma, Yingjun] Xiamen Univ Technol, Sch Appl Math, Xiamen, Peoples R China.;[Ma, Yuanyuan] Anyang Normal Univ, Sch Comp & Informat Engn, Anyang, Peoples R China.;[Jiang, Xingpeng] Cent China Normal Univ, Sch Comp, Wuhan, Peoples R China.;[Jiang, Xingpeng] Cent China Normal Univ, Hubei Key Lab Artificial Intelligence & Smart Lear, Wuhan, Peoples R China.
通讯机构:
[Xingpeng Jiang] School of Computer, Central China Normal University, Wuhan, China; Hubei Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, China
摘要:
Microbial communities are an important part of organisms and ecosystems, helping to maintain health and stability. Analyzing the interactions of microorganisms in an ecosystem and mining the co-occurrence modules of the microbial community can deepen the understanding of microbial community function. This could also improve the ability to manipulate the microbial community, thus providing new means for ecological restoration, disease treatment and drug development. Beyond the investigation of pairwise relationships, more and more studies have recognized that higher-order interactions may play important roles in explaining the diversity and complexity of the community. In this study, a hypergraph clustering method based on modularity feature projection (HCMFP) is proposed to detect microbial communities in the higher-order interaction network among microbes. Specifically, HCMFP uses information entropy to mine the higher-order logical relationships among microbes, and constructs a hypergraph learning model based on modularity feature projection to detect the microbial community. The experimental results show that, compared with other methods, HCMFP has better clustering performance and reliable convergence speed. The proposed method is an effective tool for detecting high-order organization in microbial interaction networks. The code and data in this study are freely available at https://github.com/Mayingjun20179/HCMFP.
作者机构:
[Wu, Haifang; Zhao, Weizhong; He, Tingting; Jiang, Xingpeng; Luo, Shujie] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Hubei, Peoples R China.;[Wu, Haifang; Zhao, Weizhong; He, Tingting; Jiang, Xingpeng; Luo, Shujie] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[Wu, Haifang; Zhao, Weizhong; He, Tingting; Jiang, Xingpeng; Luo, Shujie] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netw, Wuhan 430079, Hubei, Peoples R China.;[Zhao, Weizhong] Guilin Univ Elect Technol, Guangxi Key Lab Trusted Software, Guilin 541004, Peoples R China.;[Zhao, Weizhong] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541004, Peoples R China.
会议名称:
26th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD)
会议时间:
MAY 16-19, 2022
会议地点:
SW Jiaotong Univ, Chengdu, PEOPLES R CHINA
会议主办单位:
SW Jiaotong Univ
会议论文集名称:
Lecture Notes in Artificial Intelligence
关键词:
Protein interface prediction;Sequence information;Structure information;Hybrid attention mechanism
摘要:
Protein interface prediction is fundamental to understanding the hidden principles of many living activities. Although many approaches to protein interface prediction have been proposed, most existing methods fail to make full use of the available sequence and structure information. To address this challenge, we propose a deep learning-based end-to-end framework for protein interface prediction, in which a hybrid attention mechanism is used to account for the semantic associations and complementary effects between sequence and structure information. More specifically, a cross-modal attention is built to capture the semantic associations between the sequence representations and structure representations of proteins. In addition, a type-level attention is introduced to model the different contributions of sequence and structure information to predicting the protein interaction interface. Experimental results on three commonly used datasets demonstrate the effectiveness of the proposed method.