Author affiliations:
[Jiang, Xingpeng; He, Tingting; Zhao, Weizhong; Zhao, Yao] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Hubei, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Zhao, Weizhong; Zhao, Yao] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Zhao, Weizhong; Zhao, Yao] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netw, Wuhan 430079, Hubei, Peoples R China.;[Zhao, Weizhong] Guilin Univ Elect Technol, Guangxi Key Lab Trusted Software, Guilin 541004, Peoples R China.;[Zhao, Weizhong] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541004, Peoples R China.
Corresponding institution:
[Weizhong Zhao] H;Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, Hubei 430079, China;School of Computer, Central China Normal University, Wuhan, Hubei 430079, China;National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan, Hubei 430079, China;Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China;Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin 541004, China
Abstract:
Motivation: Extracting multiple events from biomedical literature is a challenging task for the biomedical community. Biomedical event extraction is usually modeled as two sub-tasks, trigger identification and argument detection. Most existing methods perform these two sub-tasks sequentially and fail to make full use of the interaction between them, leading to suboptimal results for multiple biomedical events extraction.
Results: We propose a novel reinforcement learning (RL) framework for the task of multiple biomedical events extraction. More specifically, trigger identification and argument detection are treated as the main task and the subsidiary task, respectively. Assigning the event type of a trigger (in the main task) is viewed as the action taken in RL, and the result of the corresponding argument detection (i.e. the subsidiary task) for the identified trigger is used to compute the reward of the taken action. Moreover, the result of the subsidiary task is modeled as part of the environment information in RL to help the procedure of trigger identification. In addition, external biomedical knowledge bases (KBs) are employed for representation learning of biomedical text, which can improve the performance of biomedical event extraction. Results on two widely used biomedical corpora demonstrate that the proposed framework performs better than the selected baselines on the task of multiple events extraction. The ablation test indicates the contributions of RL and external KBs to the performance improvement of the proposed method. In addition, by modeling multiple events extraction under the RL framework, the supervised information is exploited more effectively than in the classical supervised learning paradigm.
Availability and implementation: Source codes will be available at: https://github.com/David-WZhao/BioEE-RL.
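The reward idea in the abstract above (the outcome of argument detection scores the chosen trigger type) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the use of F1 as the argument score, and the +1/-1 type bonus are all assumptions made for illustration.

```python
def argument_f1(predicted, gold):
    """F1 between the predicted and gold argument sets of one trigger."""
    predicted, gold = set(predicted), set(gold)
    if not predicted and not gold:
        return 1.0  # nothing to find and nothing predicted
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def action_reward(chosen_type, gold_type, predicted_args, gold_args):
    """Hypothetical reward for the action 'assign chosen_type to a trigger'.

    Assumption: the reward is a +1/-1 bonus for the event type plus the
    F1 of the argument detection that follows the action.
    """
    type_bonus = 1.0 if chosen_type == gold_type else -1.0
    return type_bonus + argument_f1(predicted_args, gold_args)


# Example: correct event type, one spurious argument.
reward = action_reward(
    "Regulation",
    "Regulation",
    predicted_args=[("Theme", "p53"), ("Cause", "MDM2")],
    gold_args=[("Theme", "p53")],
)
```

Under these assumptions, a correct trigger type with poorly matched arguments earns less reward than a correct type with well matched arguments, which is one way the subsidiary task could feed back into the main task.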
Journal:
Information Sciences, 2021, 550:27-40, ISSN: 0020-0255
Corresponding author:
Zhao, Weizhong; Yang, Jincai
Author affiliations:
[Yang, Jincai; He, Tingting; Zhang, Jinyong; Zhao, Weizhong; Zhao, WZ] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Hubei, Peoples R China.;[Yang, Jincai; He, Tingting; Zhang, Jinyong; Zhao, Weizhong; Zhao, WZ] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[He, Tingting; Zhao, Weizhong] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netw, Wuhan 430079, Hubei, Peoples R China.;[Zhao, Weizhong] Guilin Univ Elect Technol, Guangxi Key Lab Trusted Software, Guilin 541004, Peoples R China.;[Zhao, Weizhong] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541004, Peoples R China.
Corresponding institution:
[Zhao, WZ; Yang, JC; Zhao, Weizhong] C;[Zhao, Weizhong] G;Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Hubei, Peoples R China.;Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netw, Wuhan 430079, Hubei, Peoples R China.
Abstract:
With the rapid development of information technology, the amount of textual data generated in the biomedical field keeps growing. Biomedical event extraction, a fundamental information extraction task, has gained growing interest in the biomedical community. Although researchers have proposed various approaches to this task, performance remains unsatisfactory because previous approaches fail to model biomedical documents effectively. In this paper, we propose an end-to-end framework for document-level joint biomedical event extraction. To better capture the complex relationships among contexts in biomedical documents, a two-level modeling approach is introduced. More specifically, a dependency-based GCN and a hypergraph are used to model the local context and the global context in each biomedical document, respectively. In addition, a fine-grained interaction mechanism is proposed to effectively model the interaction between local and global contexts and to learn better contextualized representations for biomedical event extraction. Comprehensive experiments on two widely used datasets are conducted, and the results demonstrate the effectiveness of the proposed framework over state-of-the-art methods. (C) 2020 Elsevier Inc. All rights reserved.
Author affiliations:
[Yuan, Shuai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[He, Tingting] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.;[He, Tingting] Cent China Normal Univ, Informat Retrieval & Knowledge Management Res Lab, Wuhan 430079, Peoples R China.;[Huang, Huan] South Cent Univ Nationalities, Sch Educ, Wuhan 430074, Peoples R China.;[Hou, Rui] South Cent Univ Nationalities, Coll Comp Sci, Wuhan 430074, Peoples R China.
Corresponding institution:
[He, Tingting] C;Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Informat Retrieval & Knowledge Management Res Lab, Wuhan 430079, Peoples R China.
Journal:
Frontiers in Genetics, 2020, 11:546210, ISSN: 1664-8021
Corresponding author:
Jiang, Xingpeng
Author affiliations:
[Zhu, Qiang] Cent China Normal Univ, Sch Informat Management, Wuhan, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Pan, Min; Zhu, Qiang; Zhu, Qing] Cent China Normal Univ, Sch Comp, Wuhan, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Pan, Min; Zhu, Qing] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan, Peoples R China.
Corresponding institution:
[Jiang, Xingpeng] C;Cent China Normal Univ, Sch Comp, Wuhan, Peoples R China.;Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan, Peoples R China.
Abstract:
As a prerequisite step in biomedical event extraction, event trigger identification has attracted growing attention in biomedical research. Existing approaches to biomedical event trigger identification have two major drawbacks: (1) each sentence in a biomedical document is handled separately, which ignores the global context; (2) they fail to address the class imbalance induced by the sparseness of event triggers in biomedical documents. To improve the performance of biomedical event trigger identification, we propose a deep neural network-based framework that effectively addresses both challenges. Specifically, the syntactic dependency tree and a hierarchical attention mechanism are utilised to model both local and global contexts. Moreover, we propose an adaptive cost learning method to address the class imbalance issue in biomedical event trigger identification. Extensive experiments are conducted on two real-world data sets, and the results demonstrate the effectiveness of the proposed framework.
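The class imbalance mentioned in the abstract above can be made concrete with a cost-sensitive loss. The sketch below uses fixed inverse-frequency class weights, a common baseline that stands in for (and is not) the adaptive cost learning method proposed in the paper; all names are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by total / (num_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    return {c: total / (len(counts) * n) for c, n in counts.items()}


def weighted_cross_entropy(probabilities, labels, weights):
    """Weighted mean negative log-likelihood.

    probabilities: one dict per token, mapping class name to predicted
    probability. Rare classes contribute more through their larger weights.
    """
    total_loss = 0.0
    total_weight = 0.0
    for probs, label in zip(probabilities, labels):
        w = weights[label]
        total_loss += -w * math.log(probs[label])
        total_weight += w
    return total_loss / total_weight


# Example: 9 non-trigger tokens ("O") and 1 trigger token.
labels = ["O"] * 9 + ["Trigger"]
weights = inverse_frequency_weights(labels)
uniform = [{"O": 0.5, "Trigger": 0.5} for _ in labels]
loss = weighted_cross_entropy(uniform, labels, weights)
```

With these weights, a single misclassified trigger costs as much as nine misclassified non-trigger tokens, which counteracts the sparseness of triggers.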
Abstract:
Dialogue intent classification plays a significant role in human-computer interaction systems. In this paper, we present a hybrid convolutional neural network and bidirectional gated recurrent unit neural network (CNN-BGRU) architecture to classify the intent of a dialogue utterance. First, character embeddings are trained and used as the inputs of the proposed model. Second, a CNN is used to extract local features from each utterance, and a maximum pooling layer is applied to select the most crucial latent semantic factors. A bidirectional gated recurrent unit (BGRU) layer architecture is used to capture the contextual semantic information. Then, two feature maps, which are the outputs of the two architectures, are integrated into the final utterance representation. The proposed model can utilize local semantic and contextual information to recognize and classify the user dialogue intent in an efficient way. The proposed model is evaluated based on a social media processing (SMP) data set and a real conversational data set. The experimental results show that the proposed model outperforms the corresponding traditional methods. In addition, compared to the CNN and BGRU methods, the classification accuracy of the proposed model is 1.4% higher for the SMP data set.
Abstract:
Pseudo-relevance feedback is a well-studied query expansion technique in which it is assumed that the top-ranked documents in an initial set of retrieval results are relevant and expansion terms are then extracted from those documents. When selecting expansion terms, most traditional models do not simultaneously consider term frequency and the co-occurrence relationships between candidate terms and query terms. Intuitively, however, a term that has a higher co-occurrence with a query term is more likely to be related to the query topic. In this article, we propose a kernel co-occurrence-based framework to enhance retrieval performance by integrating term co-occurrence information into the Rocchio model and a relevance language model (RM3). Specifically, a kernel co-occurrence-based Rocchio method (KRoc) and a kernel co-occurrence-based RM3 method (KRM3) are proposed. In our framework, co-occurrence information is incorporated into both the factor of the term discrimination power and the factor of the within-document term weight to boost retrieval performance. The results of a series of experiments show that our proposed methods significantly outperform the corresponding strong baselines over all data sets in terms of the mean average precision and over most data sets in terms of P@10. A direct comparison of standard Text Retrieval Conference data sets indicates that our proposed methods are at least comparable to state-of-the-art approaches.
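To illustrate how co-occurrence information can enter a Rocchio-style expansion, the sketch below combines the classical Rocchio update (original query weight plus averaged feedback-document term weights) with a Gaussian kernel over the distance between a candidate term and the query terms. The kernel choice, the `1 + co-occurrence` boost, and all function and parameter names are assumptions for illustration and are not taken from the KRoc or KRM3 definitions.

```python
import math
from collections import defaultdict


def gaussian_kernel(distance, sigma=5.0):
    """Kernel value that decays with the token distance."""
    return math.exp(-(distance ** 2) / (2 * sigma ** 2))


def cooccurrence_weight(tokens, term, query_terms, sigma=5.0):
    """Sum of kernel values over all (term, query term) position pairs."""
    term_pos = [i for i, t in enumerate(tokens) if t == term]
    query_pos = [i for i, t in enumerate(tokens) if t in query_terms]
    return sum(gaussian_kernel(abs(i - j), sigma)
               for i in term_pos for j in query_pos)


def expand_query(query, feedback_docs, alpha=1.0, beta=0.75, sigma=5.0):
    """Rocchio-style expansion with a hypothetical co-occurrence boost."""
    query_terms = set(query)
    new_query = defaultdict(float)
    for t in query:
        new_query[t] += alpha  # keep the original query terms
    for tokens in feedback_docs:
        for term in set(tokens):
            tf = tokens.count(term) / len(tokens)
            boost = 1.0 + cooccurrence_weight(tokens, term, query_terms, sigma)
            new_query[term] += beta * tf * boost / len(feedback_docs)
    return dict(new_query)


docs = [["heart", "attack", "risk", "factors"], ["stock", "market", "risk"]]
expanded = expand_query(["heart"], docs)
```

In this toy example, "attack" appears next to the query term "heart" and therefore receives a larger expansion weight than "market", which never co-occurs with a query term.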
Journal:
Lecture Notes in Computer Science, 2020, 12435:79-90, ISSN: 0302-9743
Author affiliations:
Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, Hubei 430079, China;School of Computer, Central China Normal University, Wuhan, Hubei 430079, China
Conference name:
9th International Conference on Health Information Science, HIS 2020
Conference date:
20 October 2020 through 23 October 2020
Proceedings title:
Health Information Science
Keywords:
Knowledge representation;Natural language processing systems;Text mining;Biomedical literature;Biomedical text minings;Disease associations;Heterogeneous data;Human microbiome;Knowledge graphs;Multi-Sources;Wikipedia;Bacteria
Journal:
Information Processing & Management, 2020, 57(6):102342, ISSN: 0306-4573
Corresponding author:
He, Tingting
Author affiliations:
[Wang, Junmei] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Peoples R China.;[Pan, Min] Hubei Normal Univ, Sch Comp & Informat Engn, Huangshi 435002, Hubei, Peoples R China.;[He, Tingting; Wang, Xueyan; Tu, Xinhui; Huang, Xiang] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.;[He, Tingting; Wang, Junmei; Wang, Xueyan; Tu, Xinhui; Pan, Min; Huang, Xiang] Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Peoples R China.;[He, Tingting; Wang, Junmei; Wang, Xueyan; Tu, Xinhui; Pan, Min; Huang, Xiang] Natl Language Resources Monitor & Res Ctr Network, Wuhan 430079, Peoples R China.
Corresponding institution:
[He, Tingting] C;[He, Tingting] H;[He, Tingting] N;Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.;Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Peoples R China.
Keywords:
Information retrieval;Pseudo-relevance feedback;Semantic matching;Text similarity
Abstract:
Pseudo-relevance feedback (PRF) is a well-known method for addressing the mismatch between query intention and query representation. Most current PRF methods consider relevance matching only from the perspective of terms used to sort feedback documents, thus possibly leading to a semantic gap between query representation and document representation. In this work, a PRF framework that combines relevance matching and semantic matching is proposed to improve the quality of feedback documents. Specifically, in the first round of retrieval, we propose a reranking mechanism in which the information of the exact terms and the semantic similarity between the query and document representations are calculated by bidirectional encoder representations from transformers (BERT); this mechanism reduces the text semantic gap by using the semantic information and improves the quality of feedback documents. Then, our proposed PRF framework is constructed to process the results of the first round of retrieval by using probability-based PRF methods and language-model-based PRF methods. Finally, we conduct extensive experiments on four Text Retrieval Conference (TREC) datasets. The results show that the proposed models outperform the robust baseline models in terms of the mean average precision (MAP) and precision P at position 10 (P@10), and the results also highlight that using the combined relevance matching and semantic matching method is more effective than using relevance matching or semantic matching alone in terms of improving the quality of feedback documents.
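The reranking step described above, which combines term-based relevance with semantic similarity, can be sketched as a linear interpolation of two normalized scores. The interpolation form, the min-max normalization, and the parameter `lam` are assumptions for illustration; the semantic scores are treated as precomputed inputs (for example, from a BERT-based similarity model), and no specific library API is implied.

```python
def min_max(scores):
    """Scale scores to [0, 1]; constant inputs map to 0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def rerank(doc_ids, relevance_scores, semantic_scores, lam=0.5):
    """Order documents by lam * relevance + (1 - lam) * semantic score."""
    rel = min_max(relevance_scores)
    sem = min_max(semantic_scores)
    combined = [lam * r + (1 - lam) * s for r, s in zip(rel, sem)]
    ranked = sorted(zip(doc_ids, combined), key=lambda p: p[1], reverse=True)
    return [doc_id for doc_id, _ in ranked]


doc_ids = ["d1", "d2", "d3"]
relevance = [12.0, 10.0, 8.0]   # e.g. first-round term-matching scores
semantic = [0.2, 0.9, 0.6]      # assumed precomputed semantic similarities
feedback_order = rerank(doc_ids, relevance, semantic, lam=0.5)
```

The top documents of `feedback_order` would then serve as the feedback set for the PRF stage. Setting `lam=1.0` recovers the purely term-based ranking.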
Author affiliations:
[Ying, Zhiwei] Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Hubei, Peoples R China.;[Huang, Jimmy Xiangji; Ying, Zhiwei] York Univ, Sch Informat Technol, Informat Retrieval & Knowledge Management Res Lab, Toronto, ON M3J 1P3, Canada.;[Zhou, Jie] East China Normal Univ, Dept Comp Sci & Technol, Shanghai 200062, Peoples R China.;[Jian, Fanghong] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Hubei, Peoples R China.;[He, Tingting] Cent China Normal Univ, Sch Comp Sci, Wuhan 430079, Hubei, Peoples R China.
Corresponding institution:
[Huang, Jimmy Xiangji] Y;York Univ, Sch Informat Technol, Informat Retrieval & Knowledge Management Res Lab, Toronto, ON M3J 1P3, Canada.
Keywords:
digital signal processing;Information retrieval;probabilistic and statistical models
Abstract:
Researchers in the field of Information Retrieval (IR) have recently focused mainly on three categories of models, namely vector-space models, probabilistic models, and statistical language models. Existing studies have typically developed IR models by refining or combining these traditional models. However, some new frameworks (e.g., the digital signal processing (DSP)-based IR framework) have not been well developed. In our research, we propose a new DSP-based IR Framework (DSPF) that introduces theories from the field of DSP, and we present two corresponding DSP-based IR models, denoted as DSPF-BM25 and DSPF-DLM, which incorporate the term weighting schemes of two well-performing probabilistic IR models, BM25 and the Dirichlet Language Model (DLM). In particular, first, we consider each query term as a spectrum with Gaussian form. Second, instead of transforming the signals from the time domain to the frequency domain, we directly represent the query terms in the frequency domain. This makes it much more controllable and precise to adjust the parameter values to improve the performance of our proposed models. To verify the effectiveness of our proposed models, we conduct extensive experiments on seven standard datasets. The results show that in most cases our proposed models outperform the strong baselines in terms of MAP.
Author affiliations:
[Ma, Yingjun] Cent China Normal Univ, Sch Math & Stat, Wuhan, Hubei, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Ma, Yingjun] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan, Hubei, Peoples R China.;[Jiang, Xingpeng; He, Tingting] Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.
Corresponding institution:
[Jiang, Xingpeng] C;Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan, Hubei, Peoples R China.;Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.
Abstract:
Many long non-coding RNAs (lncRNAs) exert their effects by interacting with the corresponding RNA-binding proteins, and identifying the interactions between lncRNAs and proteins is important for understanding the functions of lncRNAs. Because experimental methods are time-consuming and laborious, more and more computational models have been proposed to predict lncRNA-protein interactions. However, few models can effectively utilize the biological network topology of lncRNAs (proteins) and combine it with their sequence structure features, and most models cannot effectively make predictions for new proteins (lncRNAs) that do not interact with any lncRNA (protein). In this study, we proposed a projection-based neighborhood non-negative matrix decomposition model (PMKDN) to predict potential lncRNA-protein interactions by integrating multiple biological features of lncRNAs (proteins). First, according to lncRNA (protein) sequences and lncRNA expression profile data, we extracted multiple features of lncRNAs (proteins). Second, based on protein GO ontology annotation, lncRNA sequences, lncRNA (protein) feature information, and the modified lncRNA-protein interaction network, we calculated multiple similarities of lncRNAs (proteins) and fused them to obtain a more accurate lncRNA (protein) similarity network. Finally, combining the similarity and various feature information of lncRNAs (proteins), as well as the modified interaction network, we proposed a projection-based neighborhood non-negative matrix decomposition algorithm to predict the potential lncRNA-protein interactions. On two benchmark datasets, PMKDN showed better performance than other state-of-the-art methods for the prediction of new lncRNA-protein interactions, new lncRNAs, and new proteins. A case study further indicates that PMKDN can be used as an effective tool for lncRNA-protein interaction prediction.
Journal:
Frontiers in Genetics, 2019, 10:491009, ISSN: 1664-8021
Corresponding author:
Jiang, Xingpeng
Author affiliations:
[Zhu, Qiang] Cent China Normal Univ, Sch Informat Management, Wuhan, Hubei, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Pan, Min; Zhu, Qiang; Zhu, Qing] Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Pan, Min; Zhu, Qing] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan, Hubei, Peoples R China.
Corresponding institution:
[Jiang, Xingpeng] C;Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.;Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan, Hubei, Peoples R China.
Abstract:
Microbiome-wide association studies aim to uncover the relationship between microorganisms and humans, with the goal of discovering relevant biomarkers to guide disease diagnosis. However, microbiome data are complex, noisy, and high-dimensional. Traditional machine learning methods are limited by the models' representation ability and cannot learn complex patterns from the data. Recently, deep learning has been widely applied to fields ranging from text processing to image recognition due to its flexibility and high capacity. However, deep learning models must be trained with enough data to achieve good performance, which is often impractical. In addition, deep learning is considered a black box and is hard to interpret. These factors have kept deep learning from being widely used in microbiome-wide association studies. In this work, we construct a sparse microbial interaction network and embed this graph into a deep model to alleviate the risk of overfitting and improve performance. Further, we explore a Graph Embedding Deep Feedforward Network (GEDFN) to conduct feature selection and guide the identification of meaningful microbial markers. Based on the experimental results, we verify the feasibility of combining the microbial graph model with the deep learning model, and demonstrate the feasibility of applying deep learning and feature selection to microbial data. Our main contributions are as follows. First, we utilize different methods to construct a variety of microbial interaction networks and combine the networks via graph embedding deep learning. Second, we introduce a feature selection method based on graph embedding and validate the biological meaning of the microbial markers. The code is available at https://github.com/MicroAVA/GEDFN.git.
Author affiliations:
[Sun, Bo; Pan, Min] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Hubei, Peoples R China.;[Pan, Min] Hubei Normal Univ, Sch Comp & Informat Engn, Huangshi 435002, Hubei, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Zhang, Yue; Zhu, Qiang] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.
Corresponding institution:
[He, Tingting] C;Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.
Conference name:
International Conference on Intelligent Computing (ICIC) / Intelligent Computing and Biomedical Informatics (ICBI) Conference - Medical Informatics and Decision Making
Conference date:
AUG 15-18, 2018
Conference location:
PEOPLES R CHINA
Conference organizer:
[Pan, Min;Sun, Bo] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Hubei, Peoples R China.^[Pan, Min] Hubei Normal Univ, Sch Comp & Informat Engn, Huangshi 435002, Hubei, Peoples R China.^[Zhang, Yue;Zhu, Qiang;He, Tingting;Jiang, Xingpeng] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.
Abstract:
BACKGROUND: To better help doctors make decisions in the clinical setting, research is needed to connect electronic health records (EHR) with the biomedical literature. Pseudo Relevance Feedback (PRF) is a classical query modification technique that has been shown to be effective in many retrieval models and is thus suitable for handling the terse language and clinical jargon in EHR. Previous work has introduced a set of constraints (axioms) for the traditional PRF model. However, most methods do not consider both the importance of a candidate term in the feedback document and the co-occurrence relationship between a candidate term and a query term. Intuitively, terms that have a higher degree of co-occurrence with a query term are more likely to be related to the query topic. METHODS: In this paper, we incorporate the original HAL model into Rocchio's model and propose a new concept of term proximity feedback weight. A HAL-based Rocchio's model for query expansion, called HRoc, is proposed. Meanwhile, we design three normalization methods to better incorporate proximity information into query expansion. Finally, we introduce an adaptive parameter to replace the sliding window length of the HAL model, so that the window size can be selected according to document length. RESULTS: Based on the 2016 TREC Clinical Support medicine dataset, experimental results demonstrate that the proposed HRoc and HRoc_AP models are superior to other advanced models, such as the PRoc2 and TF-PRF methods, on various evaluation metrics. Compared with the PRoc2 and TF-PRF models, the MAP of our model is increased by 8.5% and 12.24%, respectively, while the F1 score of our model is increased by 7.86% and 9.88%, respectively. CONCLUSIONS: The proposed HRoc model can effectively enhance the precision and recall of Information Retrieval and yields more precise results than other models. Furthermore, after introducing the self-adaptive parameter, the advanced HRoc_AP model uses fewer hyper-parameters than other models while achieving equivalent performance, which greatly improves the efficiency and applicability of the model and thus helps clinicians retrieve clinical support documents effectively.
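For readers unfamiliar with HAL (Hyperspace Analogue to Language), the sketch below shows the standard sliding-window co-occurrence accumulation it is built on: a word pair at distance d inside a window of size K receives weight K - d + 1. The helper `proximity_to_query` and its symmetric sum are assumptions for illustration and do not reproduce the HRoc weighting or its normalization methods.

```python
from collections import defaultdict


def hal_matrix(tokens, window=5):
    """Accumulate HAL weights for each (word, preceding word) pair.

    A preceding word at distance d (1 <= d <= window) adds window - d + 1,
    so closer words contribute more.
    """
    hal = defaultdict(float)
    for i, word in enumerate(tokens):
        for d in range(1, window + 1):
            j = i - d
            if j < 0:
                break
            hal[(word, tokens[j])] += window - d + 1
    return hal


def proximity_to_query(hal, term, query_terms):
    """Hypothetical symmetric proximity of a candidate term to the query."""
    return sum(hal.get((term, q), 0.0) + hal.get((q, term), 0.0)
               for q in query_terms)


tokens = ["chest", "pain", "after", "exercise"]
hal = hal_matrix(tokens, window=2)
scores = {t: proximity_to_query(hal, t, {"chest"}) for t in tokens[1:]}
```

With a window of 2, "pain" (adjacent to "chest") scores higher than "after" (two positions away), and "exercise" falls outside the window entirely.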
Author affiliations:
[Jiang, Xingpeng; He, Tingting; Fu, Chengcheng; Li, Xusheng; Zhong, Duo] Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Fu, Chengcheng; Li, Xusheng; Zhong, Duo] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan, Hubei, Peoples R China.;[Zhong, Ran] Cent China Normal Univ, Collaborat & Innovat Ctr, Wuhan, Hubei, Peoples R China.
Corresponding institution:
[Jiang, Xingpeng] C;Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.;Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan, Hubei, Peoples R China.
Conference name:
IEEE International Conference on Bioinformatics and Biomedicine (BIBM) - Bioinformatics and Systems Biology
Conference date:
DEC 03-06, 2018
Conference location:
Madrid, SPAIN
Conference organizer:
[Li, Xusheng;Fu, Chengcheng;Zhong, Duo;He, Tingting;Jiang, Xingpeng] Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.^[Li, Xusheng;Fu, Chengcheng;Zhong, Duo;He, Tingting;Jiang, Xingpeng] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan, Hubei, Peoples R China.^[Zhong, Ran] Cent China Normal Univ, Collaborat & Innovat Ctr, Wuhan, Hubei, Peoples R China.
Keywords:
Named entity recognition;Biomedical text mining;Conditional random field;Deep learning
Author affiliations:
[Ma, Yingjun] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Hubei, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Zhang, Chenhao] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[Jiang, Xingpeng; He, Tingting] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Hubei, Peoples R China.;[Ge, Leixin] Cent China Normal Univ, Sch Life Sci, Wuhan 430079, Hubei, Peoples R China.
Corresponding institution:
[Jiang, Xingpeng] C;Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Hubei, Peoples R China.
Conference name:
IEEE International Conference on Bioinformatics and Biomedicine (BIBM) - Medical Genomics
Conference date:
DEC 03-06, 2018
Conference location:
Madrid, SPAIN
Conference organizer:
[Ma, Yingjun] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Hubei, Peoples R China.^[He, Tingting;Zhang, Chenhao;Jiang, Xingpeng] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.^[He, Tingting;Jiang, Xingpeng] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Hubei, Peoples R China.^[Ge, Leixin] Cent China Normal Univ, Sch Life Sci, Wuhan 430079, Hubei, Peoples R China.
Abstract:
BACKGROUND: Studies have shown that miRNAs are functionally associated with the development of many human diseases, but the roles of miRNAs in diseases and their underlying molecular mechanisms are not fully understood. Research on miRNA-disease interactions has received increasing attention. Compared with complex and costly biological experiments, computational methods can rapidly and efficiently predict potential miRNA-disease interactions and can serve as a beneficial supplement to experimental methods. RESULTS: In this paper, we proposed a novel computational model of kernel neighborhood similarity and multi-network bidirectional propagation (KNMBP) for miRNA-disease interaction prediction, especially for new miRNAs and new diseases. First, we integrated multiple data sources of diseases and miRNAs, respectively, to construct a novel disease semantic similarity network and a miRNA functional similarity network. Second, based on the modified miRNA-disease interactions, we used the kernel neighborhood similarity algorithm to calculate the disease kernel neighborhood similarity and the miRNA kernel neighborhood similarity. Finally, we utilized a bidirectional propagation algorithm to predict the miRNA-disease interaction scores based on the integrated disease similarity network and miRNA similarity network. As a result, the AUC value of 5-fold cross validation for all interactions by KNMBP is 0.93126 on the commonly used dataset, and the AUC values for all interactions, for all miRNAs, and for all diseases are 0.93795, 0.86363, and 0.86937, respectively, on another dataset that we extracted ourselves; these values are higher than those of other state-of-the-art methods. In addition, our model has good parameter robustness. The case study further demonstrated the predictive performance of the model for novel miRNA-disease interactions. CONCLUSIONS: Our KNMBP algorithm efficiently integrates multiple omics data from miRNAs and diseases to stably and efficiently predict potential miRNA-disease interactions. It is anticipated that KNMBP will be a useful tool in biomedical research.
Abstract:
In this paper, we study domain adaptation for semantic role classification. Most systems use supervised methods for semantic role classification, but these methods often suffer severe performance drops on out-of-domain test data. The reason for the performance drops is that there are large feature differences between the source and target domains. This paper proposes a framework called Adversarial Domain Adaption Network (ADAN) to address domain adaptation for semantic role classification. The idea behind our method is that the proposed framework can derive domain-invariant features via adversarial learning and narrow the gap between the source and target feature spaces. To evaluate our method, we conduct experiments on the English portion of the CoNLL 2009 shared task. Experimental results show that our method can largely reduce the performance drop on out-of-domain test data.