Journal:
IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2024, PP:1-12, ISSN: 2168-2194
Author affiliations:
[Xueli Pan; Frank van Harmelen] Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands;Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, China;National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, China;National Language Resources Monitor Research Center for Network Media, Central China Normal University, Wuhan, China;School of Computer Science, Central China Normal University, Wuhan, China
Abstract:
It is well known that food nutrition is closely related to human health. The complex interactions between food nutrients and diseases, influenced by gut microbial metabolism, present challenges in systematizing and practically applying knowledge. To address this, we propose a method for extracting triples from a vast amount of literature, which is used to construct a comprehensive knowledge graph on nutrition and human health. Concurrently, we develop a query-based question answering system over our knowledge graph, proficiently addressing three types of questions. The results show that our proposed model outperforms other state-of-the-art methods, achieving a precision of 0.92, a recall of 0.81, and an F1 score of 0.86 in the nutrition and disease relation extraction task. Meanwhile, our question answering system achieves an accuracy of 0.68 and an F1 score of 0.61 on our benchmark dataset, showcasing competitiveness in practical scenarios. Furthermore, we design five independent experiments to assess the quality of the data structure in the knowledge graph, ensuring results characterized by high accuracy and interpretability. In conclusion, the construction of our knowledge graph shows significant promise in facilitating diet recommendations, enhancing patient care applications, and informing decision-making in clinical research.
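As a quick consistency check on the relation extraction metrics reported in the abstract above, the F1 score is the harmonic mean of precision and recall. The sketch below recomputes it from the stated precision (0.92) and recall (0.81). The function name `f1_score` is an illustrative choice and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the nutrition-disease relation extraction task.
reported_f1 = f1_score(0.92, 0.81)
print(round(reported_f1, 2))  # 0.86
```

The recomputed value agrees with the reported F1 score of 0.86 after rounding to two decimals.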
Journal:
BRIEFINGS IN BIOINFORMATICS, 2024, 25(3), ISSN: 1467-5463
Corresponding authors:
Wan, CH; Jiang, XP
Author affiliations:
[Wan, Cuihong; Wan, CH; Peng, Zhao] Cent China Normal Univ, Sch Life Sci, 152 Luoyu Rd, Wuhan 430079, Peoples R China.;[Jiang, Xingpeng; Li, Jiaqiang] Cent China Normal Univ, Sch Comp Sci, 382 Xiongchu Ave, Wuhan 430079, Peoples R China.;[Wan, CH] Cent China Normal Univ, Hubei Key Lab Genet Regulat & Integrat Biol, 152 Luoyu Rd, Wuhan 430079, Peoples R China.;[Jiang, Xingpeng] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, 382 Xiongchu Ave, Wuhan 430079, Peoples R China.
Corresponding institution:
[Wan, CH; Jiang, XP] Cent China Normal Univ, Sch Life Sci, 152 Luoyu Rd, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Hubei Key Lab Genet Regulat & Integrat Biol, 152 Luoyu Rd, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Sch Comp Sci, 382 Xiongchu Ave, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, 382 Xiongchu Ave, Wuhan 430079, Peoples R China.
Keywords:
small ORF;microprotein;coding potential;genome annotation;machine learning
Abstract:
Small open reading frames (smORFs) have been acknowledged to play various roles in essential biological pathways and to affect human health in conditions ranging from diabetes to tumorigenesis. Predicting smORFs in silico is a prerequisite for processing omics data. Here, we propose the smORF-coding-potential-predicting framework, sOCP, which provides functions to construct a model for predicting novel smORFs in some species. The sOCP model constructed for human was based on in-frame features and the nucleotide bias around the start codon; this small feature subset proved sufficient and avoided the overfitting problems of more complicated models. It showed better prediction metrics than previous methods and correlated closely with experimental evidence in a heterogeneous dataset. The model was also applied to Rattus norvegicus and exhibited satisfactory performance. We then scanned smORFs with ATG and non-ATG start codons from the human genome and generated a database containing about a million novel smORFs with coding potential. Around 72,000 smORFs are located in the lncRNA regions of the genome. The smORF-encoded peptides may be involved in biological pathways rare for canonical proteins, including the glucocorticoid catabolic process and the prokaryotic defense system. Our work provides a model and database for human smORF investigation and a convenient tool for further smORF prediction in other species.
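The genome scan for smORFs with ATG and non-ATG start codons mentioned above can be pictured with a minimal candidate enumeration sketch. It covers only the forward strand and only the enumeration step, not the sOCP coding-potential model itself. The function name `find_smorfs`, the 100-codon cutoff, and the example start codon sets are assumptions made for illustration and are not taken from the paper.

```python
from typing import Iterator, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_smorfs(
    seq: str,
    start_codons: Tuple[str, ...] = ("ATG",),
    max_codons: int = 100,  # assumed smORF size cutoff, not taken from the paper
) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, orf) for forward-strand ORFs of at most max_codons codons.

    start is the 0-based index of the start codon; end is exclusive and
    includes the stop codon. The codon count excludes the stop codon.
    """
    seq = seq.upper()
    for start in range(len(seq) - 2):
        if seq[start:start + 3] not in start_codons:
            continue
        # Walk codon by codon in the reading frame set by the start codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3
                if n_codons <= max_codons:
                    yield start, pos + 3, seq[start:pos + 3]
                break


if __name__ == "__main__":
    for hit in find_smorfs("CCATGAAATGGTAACC"):
        print(hit)
```

Passing a wider tuple such as `("ATG", "CTG")` as `start_codons` extends the scan to a non-ATG start; which non-ATG codons the authors considered is not stated in the abstract.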
Author affiliations:
[Zhiming Luo] Department of Artificial Intelligence, Xiamen University, Xiamen, China;Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning;National Language Resources Monitoring and Research Center for Network Media;School of Computer Science, Central China Normal University, Wuhan, China;[Jingzhe Li; Chengji Wang; Yuxian Wu; Xingpeng Jiang] Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning;National Language Resources Monitoring and Research Center for Network Media;School of Computer Science, Central China Normal University, Wuhan, China
Conference name:
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference date:
14 April 2024
Conference location:
Seoul, Republic of Korea
Proceedings title:
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Abstract:
Recognizing human feelings from image and text is a core challenge of multi-modal data analysis, often applied in personalized advertising. Previous works aim at exploring the shared features, which are the matched contents between images and texts. However, the modality-dependent sentiment information (private features) in each modality is usually ignored by cross-modal interactions, even though the real sentiment is often reflected in only one modality. In this paper, we propose a Modality-Dependent Sentiment Exploring framework (MDSE). First, to exploit the private features, we compare shared features with the original image or text features, identifying previously overlooked unimodal features. Fusing the private and shared features can make the model more robust. Second, in order to obtain unified sentiment representations, we treat unimodal features and multi-modal fused features equally. We introduce a Modality-Agnostic Contrastive Loss (MACL) that performs contrastive learning between unimodal features and multi-modal fused features. The MACL can fully exploit sentiment information from multi-modal data and reduce the modality gap. Experiments on four public datasets demonstrate the effectiveness of our MDSE compared with existing methods. The full code is available at https://github.com/royal-dargon/MDSE.
Author affiliations:
[Jiang, Xingpeng; Zhao, Weizhong; He, Tingting; Zhao, WZ; Wu, Junze] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Hubei, Peoples R China.;[Zhao, Weizhong; Zhao, WZ] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[Zhao, Weizhong] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netwo, Wuhan 430079, Hubei, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Corresponding institution:
[Jiang, XP; Zhao, WZ] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Hubei, Peoples R China.;Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.
Abstract:
Motivation: The crisis of antibiotic resistance, which causes antibiotics used to treat bacterial infections to become less effective, has emerged as one of the foremost challenges to public health. Identifying the properties of antibiotic resistance genes (ARGs) is an essential way to mitigate this issue. Although numerous methods have been proposed for this task, most of these approaches concentrate solely on predicting antibiotic class, disregarding other important properties of ARGs. In addition, existing methods for simultaneously predicting multiple properties of ARGs fail to account for the causal relationships among these properties, limiting the predictive performance.
Results: In this study, we propose a causality-guided framework for annotating properties of ARGs, in which causal inference is utilized for representation learning. More specifically, the hidden biological patterns determining the properties of ARGs are described by a Gaussian Mixture Model, and a causal representation learning procedure is used to derive the hidden features. In addition, a causal graph is constructed to capture the causal relationships among properties of ARGs, and it is integrated into the annotation task. The experimental results on a real-world dataset demonstrate the effectiveness of the proposed framework on the task of annotating properties of ARGs.
Availability and implementation: The data and source code are available on GitHub at https://github.com/David-WZhao/CausalARG.
Author affiliations:
[Lv, Xing; Tu, Xinhui; Zhao, Weizhong; Zhao, WZ; Deng, Jie; Jiang, Xingpeng] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Hubei, Peoples R China.;[Lv, Xing; Tu, Xinhui; Zhao, Weizhong; Zhao, WZ; Deng, Jie; Jiang, Xingpeng] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[Tu, Xinhui; Zhao, Weizhong; Zhao, WZ; Jiang, Xingpeng] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netw, Wuhan 430079, Hubei, Peoples R China.
Conference name:
20th International Symposium on Bioinformatics Research and Applications (ISBRA)
Conference date:
July 19-21, 2024
Conference location:
Kunming, Peoples R China
Conference organizer:
[Lv, Xing;Deng, Jie;Zhao, Weizhong;Tu, Xinhui;Jiang, Xingpeng] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Hubei, Peoples R China.^[Lv, Xing;Deng, Jie;Zhao, Weizhong;Tu, Xinhui;Jiang, Xingpeng] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.^[Zhao, Weizhong;Tu, Xinhui;Jiang, Xingpeng] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netw, Wuhan 430079, Hubei, Peoples R China.
Proceedings title:
Lecture Notes in Bioinformatics
Keywords:
Antibacterial resistant genes;Biocide and metal resistance;Hierarchical classification;Deep learning;Multi-task learning
Abstract:
Bacteria resistant to antibacterial biocides and metals have become a common global concern for public health and environmental protection. Since bacterial resistance is mainly attributed to the expression of certain genes, accurately annotating antibacterial biocide and metal resistance genes is a fundamental step toward addressing or mitigating this concern. However, due to the complex biological mechanisms of bacterial resistance to antibacterial biocides and metals, existing prediction methods face challenges in accurately categorizing these genes, leading to unsatisfactory prediction performance. In this paper, we propose a Hierarchical Classification Model for annotating Antibacterial Biocide and Metal Resistance Genes (HCM-ABMRGs) by considering the global and local semantics contained in gene-coded protein sequences. More specifically, the task of annotating antibacterial biocide and metal resistance genes is treated as a three-level classification problem, in which properties of genes are annotated from coarse to fine granularity. In addition, global and local semantics contained in protein sequences are explicitly captured to derive more meaningful representations for annotating genes. Comprehensive experiments are conducted on a widely used dataset, and experimental results demonstrate the effectiveness of the proposed method.
Journal:
IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2024, 28(7):4348-4360, ISSN: 2168-2194
Corresponding author:
Zhao, WZ
Author affiliations:
[Zhao, Weizhong; Zhao, WZ; He, Tingting; Jiang, Xingpeng; Wu, Junze; Luo, Shujie] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China.;[Zhao, Weizhong; Zhao, WZ; He, Tingting; Jiang, Xingpeng; Wu, Junze; Luo, Shujie] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.;[Zhao, Weizhong; Zhao, WZ; He, Tingting; Jiang, Xingpeng; Wu, Junze; Luo, Shujie] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr netwo, Wuhan 430079, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Corresponding institution:
[Zhao, WZ] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr netwo, Wuhan 430079, Peoples R China.
Abstract:
The crisis of antibiotic resistance has become a significant global threat to human health. Understanding properties of antibiotic resistance genes (ARGs) is the first step to mitigate this issue. Although many methods have been proposed for predicting properties of ARGs, most of these methods focus only on predicting antibiotic classes, while ignoring other properties of ARGs, such as resistance mechanisms and transferability. However, acquiring all of these properties of ARGs can help researchers gain a more comprehensive understanding of the essence of antibiotic resistance, which will facilitate the development of antibiotics. In this paper, the task of predicting properties of ARGs is modeled as a multi-task learning problem, and an effective subtask-aware representation learning-based framework is proposed accordingly. More specifically, property-specific expert networks and shared expert networks are utilized respectively to learn subtask-specific features for each subtask and shared features among different subtasks. In addition, a gating-controlled mechanism is employed to dynamically allocate weights to subtask-specific semantics and shared semantics obtained respectively from property-specific expert networks and shared expert networks, thus adjusting distinctive contributions of subtask-specific features and shared features to achieve optimal performance for each subtask simultaneously. Extensive experiments are conducted on publicly available data, and experimental results demonstrate the effectiveness of the proposed framework on the task of ARGs properties prediction.
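The gating-controlled mechanism described in the abstract above can be pictured as an input-dependent softmax weighting of a property-specific expert and a shared expert. The NumPy sketch below shows that general idea with random, untrained weights. The single-layer `tanh` experts, the dimensions, and every name in the code are assumptions chosen for illustration; this is not the authors' architecture.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def gated_mixture(x, w_specific, w_shared, w_gate):
    """Mix one subtask-specific expert and one shared expert with a learned gate.

    x:          (batch, d_in) input features
    w_specific: (d_in, d_out) weights of the subtask-specific expert (hypothetical)
    w_shared:   (d_in, d_out) weights of the shared expert (hypothetical)
    w_gate:     (d_in, 2) gate weights, one logit per expert
    """
    experts = np.stack(
        [np.tanh(x @ w_specific), np.tanh(x @ w_shared)], axis=1
    )  # (batch, 2, d_out)
    gate = softmax(x @ w_gate, axis=-1)  # (batch, 2), each row sums to 1
    mixed = (gate[:, :, None] * experts).sum(axis=1)  # (batch, d_out)
    return mixed, gate


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
mixed, gate = gated_mixture(
    x,
    rng.normal(size=(8, 5)),
    rng.normal(size=(8, 5)),
    rng.normal(size=(8, 2)),
)
print(mixed.shape, gate.sum(axis=1))
```

In a multi-task setting each subtask would typically have its own gate, so the balance between shared and subtask-specific features can differ per property.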
Authors:
Yi Jia;Shanshan Zheng;Tingting He;Xingpeng Jiang
Author affiliations:
School of Mathematics and Statistics, Central China Normal University, Wuhan, PR China;Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, PR China;School of Computer, Central China Normal University, Wuhan, PR China;National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan, PR China;[Yi Jia] School of Mathematics and Statistics, Central China Normal University, Wuhan, PR China;Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, PR China;School of Computer, Central China Normal University, Wuhan, PR China
Conference name:
2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Conference date:
05 December 2023
Conference location:
Istanbul, Turkiye
Proceedings title:
2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Abstract:
Despite profound impacts on human health and nature, accurately predicting microbe-metabolite interactions remains challenging due to inherent data noise. This study applies non-negative matrix factorization (NMF) and multi-view NMF to reduce noise and exploit associations across data perspectives. NMF obtains low-dimensional microbial and metabolic representations, effectively reducing noise. The dimension-reduced spectral matrices were input into the generative network model to derive conditional probabilities of individual microbe-associated metabolites and microbe-metabolite co-occurrence probabilities, the latter enabling prediction of microbe-metabolite interactions. Moreover, multi-view NMF integrates microbial and metabolic data by mapping them into a shared subspace, thereby enhancing prediction performance and validating cross-perspective correlation modeling. This study demonstrates NMF's efficacy in noise reduction through dimensionality reduction, and multiview NMF's ability to leverage cross-view associations. Both approaches demonstrate enhanced microbe-metabolite interaction prediction utilizing NMF-based and multi-view NMF-based generative network models.
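The dimensionality reduction step named in the abstract above, non-negative matrix factorization, can be illustrated with a minimal NumPy sketch. It uses the standard Lee-Seung multiplicative updates for the Frobenius objective on a toy matrix. The function name `nmf`, the rank, the iteration count, and the synthetic data are all assumptions; the abstract does not say which NMF variant or settings the authors used, and the generative network model is not covered here.

```python
import numpy as np


def nmf(v: np.ndarray, rank: int, n_iter: int = 200, seed: int = 0, eps: float = 1e-10):
    """Factor a non-negative matrix v (m x n) as w @ h, with w (m x rank) and h (rank x n).

    Uses Lee-Seung multiplicative updates, which keep w and h non-negative.
    """
    rng = np.random.default_rng(seed)
    m, n = v.shape
    w = rng.random((m, rank)) + eps
    h = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        h *= (w.T @ v) / (w.T @ w @ h + eps)
        w *= (v @ h.T) / (w @ h @ h.T + eps)
    return w, h


# Toy abundance-like matrix: 6 samples x 5 features from 2 latent factors plus small noise.
rng = np.random.default_rng(1)
clean = rng.random((6, 2)) @ rng.random((2, 5))
noisy = np.clip(clean + 0.01 * rng.normal(size=clean.shape), 0.0, None)

w, h = nmf(noisy, rank=2)
rel_err = np.linalg.norm(noisy - w @ h) / np.linalg.norm(noisy)
print(w.shape, h.shape, rel_err)
```

Each row of `w` is a low-dimensional representation of one sample, and the product `w @ h` is a smoothed reconstruction of the input, which is the sense in which a low-rank factorization can suppress noise.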
Author affiliations:
School of Computer, Central China Normal University, Wuhan, PR China;Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, PR China;National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan, PR China;[Yanan Yao; Tian Yu; Huanghan Zhan; Weizhong Zhao; Tingting He; Xingpeng Jiang] School of Computer, Central China Normal University, Wuhan, PR China;Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, PR China;National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan, PR China
Conference name:
2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Conference date:
05 December 2023
Conference location:
Istanbul, Turkiye
Proceedings title:
2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Abstract:
Antibiotic resistance event extraction involves the automated extraction of information related to antibiotic resistance mechanisms from a vast amount of biomedical literature. This can be achieved by utilizing natural language processing techniques. However, the distinctive characteristics of the biomedical field lead to various challenges for existing antibiotic resistance event extraction methods, such as limited labeled data, complex names of biomedical entities, and nested and overlapping event structures. These factors make it challenging to apply current biomedical text processing methods to the task of antibiotic resistance event extraction. To address these challenges, we propose a cascade decoding approach for antibiotic resistance event extraction based on contrastive learning (CL-MA-CasEE). This approach achieves data augmentation by constructing two contrastive learning tasks, which combine entity type embedding and POS embedding to enrich the semantic information of word representations. Furthermore, it performs event type detection, event trigger extraction, and event argument extraction using three cascade decoders to model the complex event structures. Based on experiments, we demonstrate that our method can effectively extract structured antibiotic resistance event information from biomedical literature, thereby also improving the efficiency of event extraction tasks.
Journal:
Artificial Intelligence in Medicine, 2023, 145:102677, ISSN: 0933-3657
Corresponding author:
Jiang, XP
Author affiliations:
[Fu, Chengcheng] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.;[Jiang, Xingpeng; Fu, Chengcheng; He, Tingting] Cent China Normal Univ, Sch Comp Sci, Wuhan, Peoples R China.;[Fu, Chengcheng; van Harmelen, Frank; Huang, Zhisheng] Vrije Univ Amsterdam, Dept Comp Sci, Amsterdam, Netherlands.;[Fu, Chengcheng; He, Tingting; Jiang, Xingpeng] Cent China Normal Univ, Natl Language Resources Monitor Res Ctr Network Me, Wuhan, Peoples R China.;[Huang, Zhisheng] Tongji Univ, Sch Med, Clin Res Ctr Mental Disorders, Shanghai Pudong New Area Mental Hlth Ctr, Shanghai, Peoples R China.
Corresponding institution:
[Jiang, XP] Cent China Normal Univ, Sch Comp Sci, Wuhan, Peoples R China.
Keywords:
Food;Gut microbiota;Knowledge graph;Mental health
Author affiliations:
School of Computer, Central China Normal University, Wuhan, PR China;Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, PR China;National Language Resources Monitor Research Center for Network Media, Central China Normal University, Wuhan, PR China;National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, PR China;[Chengcheng Fu; Tingting He] School of Computer, Central China Normal University, Wuhan, PR China;National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, PR China;Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, PR China;National Language Resources Monitor Research Center for Network Media, Central China Normal University, Wuhan, PR China
Conference name:
2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Conference date:
05 December 2023
Conference location:
Istanbul, Turkiye
Proceedings title:
2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Keywords:
Knowledge graph;Multimodal embedding;Knowledge reasoning;Nutrition;Human health
Abstract:
The established links between nutrition and human health are widely acknowledged. Dietary nutrients play a crucial role in regulating gut microbial communities, influencing various human diseases. With a growing number of related studies, there is a need to systematically organize these associations for coherent knowledge reasoning. However, due to the diverse and extensive nature of the knowledge landscape, significant challenges persist. To address this, we propose an approach using multimodal data and knowledge embeddings for effective knowledge reasoning in nutrition and human health. We create a comprehensive knowledge graph, KG4NH, covering dietary nutrition, gut microbiota, and human diseases. To ensure efficient knowledge representation, we employ knowledge embedding techniques to develop modality-specific encoders for structure, category, and description. Additionally, we introduce a multimodal fusion method to capture shared information across modalities. Our experimental results demonstrate the superiority of our approach over other state-of-the-art methods.
Journal:
BRIEFINGS IN BIOINFORMATICS, 2023, 24(2), ISSN: 1467-5463
Corresponding author:
Xingpeng Jiang
Author affiliations:
[Wang, Haodong; Wang, Yue; Xiao, Zhen; Huang, Xiaoyun; He, Tingting; Jiang, Xingpeng; Sun, Han] Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China;[Wang, Haodong; Wang, Yue; Xiao, Zhen; Huang, Xiaoyun; He, Tingting; Jiang, Xingpeng; Sun, Han] School of Computer Science, Central China Normal University, Wuhan 430079, China;[Xiao, Zhen; Sun, Han] School of Mathematics and Statistics, Central China Normal University, Wuhan 430079, China;[Huang, Xiaoyun] Collaborative & Innovative Center for Educational Technology, Central China Normal University, Wuhan 430079, China;[He, Tingting; Jiang, Xingpeng] National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan 430079, China
Corresponding institution:
[Xingpeng Jiang] Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China;School of Computer Science, Central China Normal University, Wuhan 430079, China;National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan 430079, China
Keywords:
Kernel machine regression;Microbiome-based association test;Multinomial logit model;Ordinal/Nominal multicategory phenotypes
Abstract:
Microbes continually affect the metabolism and immunity of the human body, and the dysbiosis of the human microbiome drives not only the occurrence but also the progression of disease (i.e. multiple statuses of disease). Recently, microbiome-based association tests have been widely developed to detect the association between the microbiome and host phenotype. However, the existing methods have not achieved satisfactory performance in testing the association between the microbiome and ordinal/nominal multicategory phenotypes (e.g. disease severity and tumor subtype). In this paper, we propose an optimal microbiome-based association test for multicategory phenotypes, namely, multiMiAT. Specifically, under the multinomial logit model framework, we first introduce a microbiome regression-based kernel association test for multicategory phenotypes (multiMiRKAT). As a data-driven optimal test, multiMiAT then integrates multiMiRKAT, score test and MiRKAT-MC to maintain excellent performance in diverse association patterns. Extensive simulation experiments demonstrate the success of our method. Furthermore, multiMiAT is also applied to real microbiome data to detect the association between the gut microbiome and clinical statuses of colorectal cancer, as well as diverse statuses of Clostridium difficile infections.
Author affiliations:
Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, China;School of Computer, Central China Normal University, Wuhan, China;National Language Resources Monitoring and Research Center for Network Media, Central China Normal University, Wuhan, China;[Xiaowei Xu] Department of Information Science, University of Arkansas at Little Rock, Little Rock, AR, USA;[Xueling Yuan] Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, China;School of Computer, Central China Normal University, Wuhan, China
Conference name:
2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Conference date:
05 December 2023
Conference location:
Istanbul, Turkiye
Proceedings title:
2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Keywords:
drug–drug interaction;cold-start scenario;counterfactual inference;meta-path-based fusion;heterogeneous information network
Abstract:
A drug-drug interaction (DDI) occurs when the concomitant use of two or more drugs alters their pharmacokinetic or pharmacodynamic behavior, resulting in unexpected effects. Accurately predicting DDIs is important for ensuring drug safety. Despite the numerous approaches proposed for DDI prediction, most of these methods overlook the challenge presented by the cold-start scenario, which limits their applicability. This paper presents a novel data augmentation approach for the prediction of DDIs in cold-start scenarios. The method leverages counterfactual inference to generate meaningful pseudo samples for drugs with limited prior information. To achieve this, a heterogeneous information network (HIN) relevant to DDIs is first established by combining various associations between drugs and proteins. Subsequently, the identification of drug communities within this HIN is regarded as the treatment in counterfactual inference, facilitating the generation of counterfactual links for cold-start drugs and thereby augmenting the training dataset. Lastly, we learn richer drug representations through a meta-path-based fusion mechanism, ultimately improving the accuracy of DDI prediction in cold-start scenarios. We substantiate the effectiveness of our proposed method through an extensive series of experiments.
Author affiliations:
[Wan, Cuihong; Peng, Zhao] Cent China Normal Univ, Sch Life Sci, Wuhan 430079, Hubei, Peoples R China.;[Wan, Cuihong; Peng, Zhao] Cent China Normal Univ, Hubei Key Lab Genet Regulat & Integrat Biol, Wuhan 430079, Hubei, Peoples R China.;[Li, Jiaqiang; Jiang, Xingpeng] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[Li, Jiaqiang; Jiang, Xingpeng] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Hubei, Peoples R China.
Corresponding institution:
[Xingpeng Jiang; Cuihong Wan] School of Computer, and Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, Hubei, People's Republic of China;School of Life Sciences, and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan 430079, Hubei, People's Republic of China
Abstract:
Cyanobacteria are one of the essential life forms in the biosphere, and research on them has grown remarkably for decades. Biological functions in organisms are often accomplished through protein–protein interactions (PPIs), which help to regulate interacting proteins or organize them into an integral machine. However, the study of PPIs in cyanobacteria falls far behind that in mammals, and the available data have not been integrated for ease of use. Thus, we built CyanoMapDB (http://www.cyanomapdb.msbio.pro/), a database providing cyanobacterial PPIs with experimental evidence, consisting of 52,304 PPIs among 6,789 proteins from 23 cyanobacterial species. We collected available data in UniProt, STRING, and IntAct, and mined numerous PPIs from co-fractionation MS data in cyanobacteria. The integrated data are accessible in CyanoMapDB, enabling users to easily query proteins of interest, investigate interacting proteins with evidence from different sources, and acquire a visual network of the target protein. We believe that CyanoMapDB will promote research involving cyanobacteria and plants.
Journal:
Information Processing & Management, 2023, 60(1):103114, ISSN: 0306-4573
Corresponding author:
Weizhong Zhao
Author affiliations:
[Zhao, Weizhong; Xia, Jun; He, Tingting; Jiang, Xingpeng] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Hubei, Peoples R China.;[Zhao, Weizhong] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[Zhao, Weizhong] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netwo, Wuhan 430079, Hubei, Peoples R China.
Corresponding institution:
[Weizhong Zhao] Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, Hubei, China;School of Computer, Central China Normal University, Wuhan 430079, Hubei, China;National Language Resources Monitoring and Research Center for Network Media, Central China Normal University, Wuhan 430079, Hubei, China
Keywords:
Deep knowledge tracing;Forgetting and learning mechanisms;Intelligent education
Journal:
IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2023, 20(6):3635-3647, ISSN: 1545-5963
Author affiliations:
[Shi, Chuan] School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, China;[Yao, Wenjie; Jiang, Xingpeng; He, Tingting] School of Computer, Central China Normal University, Wuhan, Hubei, China;[Hu, Xiaohua] College of Computing & Informatics, Drexel University, Philadelphia, PA, USA;[Zhao, Weizhong] Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, School of Computer, and National Language Resources Monitoring and Research Center for Network Media, Central China Normal University, Wuhan, Hubei, China
Abstract:
Side effects of drugs have gained increasing attention in the biomedical field, and accurate identification of drug side effects is essential for drug development and drug safety surveillance. Although traditional pharmacological experiments can accurately detect the side effects of drugs, the identification process is time-consuming, costly, and may lead to incomplete identification of side effects. With the expansion of various biomedical databases, many computational methods have been developed for the task of drug-side effect association (DSA) prediction. However, existing methods have the following three drawbacks: (1) multiple drug-related databases are not fully used; (2) the complex semantics among drugs and side effects are not effectively captured; (3) the explainability of the predicted DSAs is lacking in most existing methods. Therefore, there is an urgent need to find a more effective method for predicting DSAs. To address these issues, we propose a novel meta-path-based graph neural network model for drug-side effect association prediction (MPGNN-DSA). In MPGNN-DSA, a heterogeneous information network (HIN) is first constructed by combining multiple biological datasets. Then, a meta-path-based feature learning module is utilized for learning high-quality representations of drugs and side effects by capturing the semantics contained in meta-paths of the constructed HIN. With the learned features, the prediction module derives the predicted side effects for drugs. In addition, the semantics contained in meta-paths provide explainability for the predicted DSAs. We conduct comprehensive experiments, and the results demonstrate the effectiveness of MPGNN-DSA, suggesting that the proposed method will be a feasible solution to the task of DSA prediction.
Journal:
IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2022, 19(1):513-521, ISSN: 1545-5963
Corresponding author:
Zhang, XF
Author affiliations:
[Tan, Yu-Ting; Zhang, Xiao-Fei] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Hubei, Peoples R China.;[Tan, Yu-Ting; Zhang, Xiao-Fei] Cent China Normal Univ, Hubei Key Lab Math Sci, Wuhan 430079, Hubei, Peoples R China.;[Ou-Yang, Le] Shenzhen Univ, Coll Informat Engn, Shenzhen 518060, Guangdong, Peoples R China.;[Ou-Yang, Le] Shenzhen Univ, Shenzhen Key Lab Media Secur, Shenzhen 518060, Guangdong, Peoples R China.;[Jiang, Xingpeng] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.
Corresponding institution:
[Zhang, XF] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Hubei, Peoples R China.;Cent China Normal Univ, Hubei Key Lab Math Sci, Wuhan 430079, Hubei, Peoples R China.
Abstract:
Learning how gene regulatory networks change under different conditions is an important task. Several Gaussian graphical model-based methods have been proposed to address it by inferring differential networks from gene expression data. However, most existing methods define the differential network as the difference between precision matrices, which may include false differential edges caused by changes in conditional variances. In addition, prior information about the condition-specific networks and the differential networks can be obtained from other domains, and it is useful to incorporate this information into differential network analysis. In this study, we propose a new differential network analysis method to address these challenges. Instead of using the precision matrices, we define the differential network as the difference between partial correlations, which excludes spurious differential edges due to changes in conditional variances. Furthermore, prior information from multiple hypothesis testing is incorporated using a weighted fused penalty. Simulation studies show that our method outperforms the competing methods. We also apply our method to identify the differential network between luminal A and basal-like subtypes of breast cancer and the differential network between acute myeloid leukemia tumors and normal samples. The hub genes in the differential networks identified by our method carry out important biological functions.
Author affiliations:
[Zhong, Duo; Jiang, Xingpeng; Li, Bojing] Cent China Normal Univ, Hubei Key Lab Artificial Intelligence & Smart Lear, Wuhan, Peoples R China.;[Zhong, Duo; Jiang, Xingpeng; Li, Bojing] Cent China Normal Univ, Sch Comp, Wuhan, Peoples R China.;[Qiao, Jimei] Shanghai Normal Univ, Math & Sci Coll, Shanghai, Peoples R China.;[Jiang, Xingpeng] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netwo, Wuhan, Peoples R China.
Corresponding institution:
[Xingpeng Jiang] Hubei Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, China;School of Computer, Central China Normal University, Wuhan, China;National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan, China
Abstract:
Microorganisms play important roles in human life, especially in metabolism and disease. Predicting from microbial genes how likely a person is to develop a specific disease, and how severe it will be, is crucial for understanding the relationship between microbes and diseases. Previous research extracted the topological information of phylogenetic trees and integrated it into metagenomic datasets, enabling classifiers to learn more from limited data and thereby improving model performance. In this paper, we propose the GNPI model to better learn the structure of phylogenetic trees. GNPI maintains the original vector format of metagenomic datasets, whereas previous research had to convert the input to matrices. The vector-form input can be easily used by baseline machine learning models and is also suitable for deep learning models. Datasets processed with GNPI improve the accuracy of machine learning and deep learning models on three different datasets. GNPI is an interpretable data processing method for host phenotype prediction and other bioinformatics tasks.
Abstract:
The increasing prevalence of antibiotic resistance has become a global health crisis. For safety regulation, it is highly important to identify antibiotic resistance genes (ARGs) in bacteria. Although culture-based methods can identify ARGs with relatively high accuracy, the process is time-consuming and requires specialized knowledge. With the rapid development of whole genome sequencing technology, researchers have attempted to identify ARGs by computing sequence similarity against public databases. However, these computational methods may fail to detect ARGs that have low sequence identity to known ARGs. Moreover, existing methods cannot effectively predict multidrug resistance for ARGs, which poses a great challenge to clinical treatment. To address these challenges, we propose an end-to-end multi-label learning framework for predicting ARGs. More specifically, ARG prediction is modeled as a multi-label learning problem, and a deep neural network-based end-to-end framework is proposed, in which a specific loss function is introduced to exploit the advantages of multi-label learning for ARG prediction. In addition, a dual-view modeling mechanism is employed to make full use of the semantic associations between two views of ARGs, i.e. sequence-based information and structure-based information. Extensive experiments are conducted on publicly available data, and the results demonstrate the effectiveness of the proposed framework for ARG prediction.
Journal:
IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2022, 19(3):1322-1333, ISSN: 1545-5963
Corresponding author:
Jiang, Xingpeng(xpjiang@mail.ccnu.edu.cn)
Author affiliations:
[He, Tingting; Jiang, Xingpeng; Ma, Yingjun] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[Ma, Yingjun] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Hubei, Peoples R China.;[He, Tingting; Jiang, Xingpeng] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Hubei, Peoples R China.;[Tan, Yuting] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Peoples R China.;[Tan, Yuting] Hubei Key Lab Math Sci, Wuhan 430079, Hubei, Peoples R China.
Corresponding institution:
Central China Normal University, School of Computer, Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Hubei, Wuhan, China
Conference name:
18th Asia Pacific Bioinformatics Conference (APBC)
Conference date:
AUG 18-20, 2020
Conference location:
ELECTR NETWORK
Conference organizer:
[Ma, Yingjun; He, Tingting; Jiang, Xingpeng] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[Ma, Yingjun] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Hubei, Peoples R China.;[He, Tingting; Jiang, Xingpeng] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Hubei, Peoples R China.;[Tan, Yuting] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Peoples R China.;[Tan, Yuting] Hubei Key Lab Math Sci, Wuhan 430079, Hubei, Peoples R China.
Abstract:
Infectious diseases are currently the most important and widespread health problem, and identifying viral infection mechanisms is critical for controlling diseases caused by highly infectious viruses. Because verified non-interacting protein pairs are lacking and positive and negative samples are seriously imbalanced, supervised learning algorithms are not well suited to predicting virus-human protein-protein interactions (PPIs). At the same time, because information on viral proteins is scarce and their sequences are highly dissimilar, some ensemble learning models generalize poorly. In this paper, we propose a Sequence-Based Ensemble Learning (Seq-BEL) method to predict potential virus-human PPIs. Specifically, based on the amino acid sequences of proteins and the currently known virus-human PPI network, Seq-BEL calculates various features and similarities of human proteins and viral proteins, and then combines these similarities and features to score potential virus-human PPIs. The computational results show that Seq-BEL successfully predicts potential virus-human PPIs and outperforms other state-of-the-art methods. More importantly, Seq-BEL also has good predictive performance for new human proteins and new viral proteins. In addition, the model is robust and generalizes well, and can be used as an effective tool for virus-human PPI prediction.