Author affiliations:
[Zhu, Qiang] Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Hubei, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Hu, Xiaohua; Pan, Min; Zhu, Qing] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Corresponding institution:
[He, Tingting] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.
Conference name:
IEEE International Conference on Bioinformatics and Biomedicine (BIBM) - Human Genomics
Conference date:
DEC 03-06, 2018
Conference location:
Madrid, SPAIN
Conference organizer:
[Zhu, Qiang] Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Hubei, Peoples R China.^[Zhu, Qing;Pan, Min;Jiang, Xingpeng;Hu, Xiaohua;He, Tingting] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.^[Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Proceedings title:
IEEE International Conference on Bioinformatics and Biomedicine-BIBM
Abstract:
Microorganisms are closely related to human health and have an impact on the development of various diseases. In personalized medicine, it is extremely important to identify the relationships between microorganisms and phenotypes (such as healthy or disease status) by analyzing microbial abundance. Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have improved the state of the art in speech recognition, visual object recognition and object detection. However, current deep models are typically neural networks, that is, multiple layers of parameterized differentiable nonlinear models that can be trained by backpropagation. It is interesting to explore other deep learning models for tasks with small sample sizes and high-dimensional data. A unique feature of microbial data is that it carries phylogenetic tree structure information, which can be embedded to improve classification performance. In this work, to further improve metagenomic classification, we propose a deep model named Cascade Deep Forest, which keeps the spatial structure between nodes by embedding phylogenetic tree information. Our results demonstrate that: 1) the modified cascade structure can enhance the classification performance of Deep Forest; 2) embedding phylogenetic tree information can also improve the classification performance of the models; 3) Deep Forest achieves performance that is highly competitive with deep neural networks.
Author affiliations:
[Jiang, Xingpeng; Yang, Jincai; He, Tingting; Shen, Xianjun; Hu, Xiaohua; Shen, XJ; Gong, Xue] Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Conference name:
IEEE International Conference on Bioinformatics and Biomedicine (BIBM) - Human Genomics
Conference date:
DEC 03-06, 2018
Conference location:
Madrid, SPAIN
Conference organizer:
[Shen, Xianjun;Gong, Xue;Jiang, Xingpeng;Yang, Jincai;He, Tingting;Hu, Xiaohua] Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.^[Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Proceedings title:
IEEE International Conference on Bioinformatics and Biomedicine-BIBM
Keywords:
weighted directed motifs;microbial network;high order structures;motif-based clustering
Abstract:
High-order connectivity patterns are essential to understanding the basic structure of complex networks. Network motifs are considered the basic building blocks of complex networks. Identifying network motifs and then using them to discover higher-order modular organization helps in studying the organizing principles and functional modules of biological networks in a divide-and-conquer manner. However, current research based on network motifs often neglects the influence of weights in network motifs. In this paper, the concept of weighted motifs is presented and applied to microbial networks. A method is proposed to find the optimal weighted motif in a microbial network and to analyze the high-order structure of weighted networks based on it. It is also proved that, in theory, partially weighted motifs can obtain optimal clusters over unweighted ones.
Abstract:
Dynamic networks are drawing more and more attention due to their potential for capturing time-dependent phenomena such as online public opinion and biological systems. Microbial interaction networks that model the microbial system are often dynamic, so static analysis methods struggle to obtain reliable knowledge about evolving communities. To fill this gap, a dynamic clustering approach based on evolutionary symmetric nonnegative matrix factorization (ESNMF) is used to analyze microbiome time-series data. To our knowledge, this is the first attempt to extract dynamic modules across time-series microbial interaction networks. ESNMF systematically integrates a temporal smoothness cost into the objective function by simultaneously refining the clustering structure in the current network and minimizing the clustering deviation between successive timestamps. We apply the proposed framework to a human microbiome dataset from infants delivered vaginally and infants born via C-section. The proposed method can not only identify the evolving modules related to certain functions of microbial communities, but also discriminate differences between the two kinds of networks obtained from infants delivered vaginally and via C-section.
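To make the idea of a temporal smoothness cost concrete, one common way to write such an objective is sketched below. The symbols are assumptions introduced here for illustration and are not taken from the paper: A_t is the symmetric interaction (similarity) matrix at timestamp t, H_t is the nonnegative cluster-membership matrix, and alpha is a weight that trades off fit to the current network against deviation from the previous clustering. The paper's exact formulation may differ.

```latex
\min_{H_t \ge 0} \;
\underbrace{\left\lVert A_t - H_t H_t^{\top} \right\rVert_F^2}_{\text{fit to the current network}}
\; + \;
\alpha \,
\underbrace{\left\lVert H_t - H_{t-1} \right\rVert_F^2}_{\text{temporal smoothness}}
```

With alpha = 0 this reduces to ordinary symmetric NMF applied independently at each timestamp; larger alpha keeps the modules at time t closer to those at time t-1.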
Author affiliations:
[Zhu, Qiang] Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Hubei, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Hu, Xiaohua; Pan, Min; Hu, XH; Liu, Lei; Li, Bojing] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Conference name:
IEEE International Conference on Bioinformatics and Biomedicine (BIBM) - Human Genomics
Conference date:
DEC 03-06, 2018
Conference location:
Madrid, SPAIN
Conference organizer:
[Zhu, Qiang] Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Hubei, Peoples R China.^[Pan, Min;Liu, Lei;Li, Bojing;He, Tingting;Jiang, Xingpeng;Hu, Xiaohua] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.^[Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Proceedings title:
IEEE International Conference on Bioinformatics and Biomedicine-BIBM
Abstract:
With the rapid advancement of DNA sequencing, metagenomics and metatranscriptomics have made great progress, which deepens our understanding of the human microbiome and its impact on human health and diseases. The microbiome, which is characterized by small samples, high dimensions and complicated relationships with hosts, refers to the species, genes and genomes of the microbiota, as well as the products of the microbiota and the host environment. Many machine learning methods have been used to conduct Microbiome-Wide Association Studies, which can link the microbiome with phenotypes such as the status of human health and diseases. However, existing methods such as Support Vector Machines (SVMs) are limited in deep representation learning, where deep architectures can promote the reuse of features and potentially lead to progressively more abstract features at higher layers of representation. Recently, Deep Neural Networks (DNNs), a kind of deep learning model, have been widely used for metagenomic data analysis and can perform well on representation learning. But they are considered a black box and are criticized for their lack of interpretability. Thus, it is interesting to explore other deep learning models for metagenomic data analysis. In this work, we introduce a deep learning model called Deep Forest to study microbiome associations, and we also present an ensemble method for feature selection. Experimental results show that Deep Forest outperforms the traditional machine learning methods. In addition, compared to DNNs, Deep Forest has better interpretability and fewer hyperparameters.
Journal:
PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2018: 197-200 ISSN: 2156-1125
Corresponding author:
Hu, XH
Author affiliations:
[Ma, Yingjun] Cent China Normal Univ, Sch Math & Stat, Wuhan, Hubei, Peoples R China.;[Ge, Leixin] Cent China Normal Univ, Sch Life Sci, Wuhan, Hubei, Peoples R China.;[Ma, Yuanyuan] Anyang Normal Univ, Sch Comp & Informat Engn, Anyang, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Hu, Xiaohua; Hu, XH] Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.
Corresponding institution:
[Hu, XH] Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.
Conference name:
IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Conference date:
DEC 03-06, 2018
Conference location:
Madrid, SPAIN
Conference organizer:
[Ma, Yingjun] Cent China Normal Univ, Sch Math & Stat, Wuhan, Hubei, Peoples R China.^[Ge, Leixin] Cent China Normal Univ, Sch Life Sci, Wuhan, Hubei, Peoples R China.^[Ma, Yuanyuan] Anyang Normal Univ, Sch Comp & Informat Engn, Anyang, Peoples R China.^[Jiang, Xingpeng;He, Tingting;Hu, Xiaohua] Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.
Proceedings title:
IEEE International Conference on Bioinformatics and Biomedicine-BIBM
Abstract:
Studies have shown that microRNAs are functionally related to human diseases. However, experimental methods for detecting miRNA-disease associations are both time-consuming and laborious. Therefore, a large number of computational models for predicting potential miRNA-disease interactions have been proposed. However, few methods take into account the nonlinear structural similarity of miRNAs (diseases) and effectively integrate multiple similarity metrics into one network. In this paper, we propose a kernel-based soft-neighborhood network propagation algorithm (LKSNF) to predict potential miRNA-disease interactions, which not only exploits the potential nonlinear relationships, but also effectively integrates different similarity measures of miRNAs (diseases). The results of 5-fold cross-validation show that the LKSNF model has significantly better predictive performance than other state-of-the-art methods. A case study further illustrates the effectiveness of LKSNF in predicting new miRNA-disease interactions.
Abstract:
The study of microbe-disease associations provides valuable material for understanding disease pathogenesis. Developing a highly accurate algorithm for predicting disease-related microbes will provide a basis for targeted treatment of diseases. In this paper, we propose an approach based on Kernelized Bayesian Matrix Factorization (KBMF) to predict microbe-disease associations, using the Gaussian interaction profile kernel similarity for microbes and diseases. The prediction performance of the method was evaluated by five-fold cross-validation. KBMF achieved reliable results that are better than those of several state-of-the-art methods, with around 8% improvement in AUC. Furthermore, case studies demonstrated the reliability of the method.
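As an illustration of the Gaussian interaction profile (GIP) kernel similarity mentioned above, the following is a minimal sketch in pure Python. It uses the commonly cited form of the kernel, in which the bandwidth is normalized by the mean squared norm of the interaction profiles. The function name `gip_kernel`, the parameter `gamma_prime` and the toy association matrix are assumptions made for illustration; the abstract does not specify the paper's settings.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between rows of an association matrix.

    profiles: list of equal-length rows; row i is the interaction profile of entity i.
    gamma_prime: bandwidth control (hypothetical default, not taken from the paper).
    """
    n = len(profiles)
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq = sum(sq_norms) / n
    if mean_sq == 0:
        raise ValueError("all interaction profiles are empty")
    # Normalize the bandwidth by the average squared profile norm.
    gamma = gamma_prime / mean_sq
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * dist_sq)
    return kernel


# Toy binary association matrix (made up): rows are microbes, columns are diseases.
associations = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
microbe_sim = gip_kernel(associations)
# Transposing the matrix gives disease profiles over microbes.
disease_sim = gip_kernel([list(col) for col in zip(*associations)])
```

Entities with identical interaction profiles get similarity 1, and similarity decays with the squared distance between profiles. The two resulting matrices can serve as side-information kernels for a factorization model.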
Journal:
2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017: 3882-3887 ISSN: 2639-1589
Corresponding author:
Hu, Xiaohua
Author affiliations:
[Jiang, Xingpeng; He, Tingting; Shen, Xianjun; Hu, Xiaohua; Gao, Li; Zhu, Xianchao] Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.;[Shen, Xianjun; Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Corresponding institution:
[Hu, Xiaohua] Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Conference name:
IEEE International Conference on Big Data (IEEE Big Data)
Conference date:
DEC 11-14, 2017
Conference location:
Boston, MA
Conference organizer:
[Shen, Xianjun;Zhu, Xianchao;Jiang, Xingpeng;Gao, Li;He, Tingting;Hu, Xiaohua] Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.^[Shen, Xianjun;Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Proceedings title:
IEEE International Conference on Big Data
Abstract:
Some disease-related symptoms share a common pathological and physiological mechanism, a phenomenon known as phenotypic overlapping. Researchers attempt to visualize the phenotypic relationships between different human diseases from the perspective of machine learning, but traditional visualization methods may be subject to fundamental limitations of metric spaces. The multiple maps t-SNE (mm-tSNE) regularization method, a probabilistic method for visualizing data points in multiple low-dimensional spaces, has been proposed to address this limitation. However, its convergence is slow when it is applied to large-scale datasets. We use RMSProp with Nesterov momentum to optimize the objective loss function. This method normalizes the gradients by applying an exponential moving average of the gradient magnitude for each parameter at each iteration, and uses Nesterov momentum to counterbalance excessively high velocities by "peeking ahead" at the actual objective values in the candidate search direction. This method converges faster than the original method. Experimental results on several datasets show that the proposed method outperforms several versions of mm-tSNE with or without regularization, as measured by the neighborhood preservation ratio and error rate. This suggests that the modified mm-tSNE regularization can be applied directly in other domains, including social, biological and microbiomic datasets.
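The update rule described above can be sketched as follows. This is a minimal pure-Python version of the textbook form of RMSProp with Nesterov momentum, applied to a toy quadratic rather than to the mm-tSNE objective. The function name, the hyperparameter defaults and the toy loss are assumptions for illustration, and the paper's exact variant may differ.

```python
import math


def rmsprop_nesterov(grad_fn, theta, lr=0.01, decay=0.9, momentum=0.9,
                     eps=1e-8, steps=2000):
    """Minimize a function given its gradient, using RMSProp with Nesterov momentum."""
    velocity = [0.0] * len(theta)
    avg_sq = [0.0] * len(theta)  # moving average of squared gradients
    for _ in range(steps):
        # "Peek ahead": evaluate the gradient where the momentum would carry us.
        lookahead = [t + momentum * v for t, v in zip(theta, velocity)]
        grad = grad_fn(lookahead)
        # Exponential moving average of squared gradient, per parameter.
        avg_sq = [decay * r + (1.0 - decay) * g * g for r, g in zip(avg_sq, grad)]
        # Normalize each gradient component by its running magnitude.
        velocity = [momentum * v - lr * g / (math.sqrt(r) + eps)
                    for v, g, r in zip(velocity, grad, avg_sq)]
        theta = [t + v for t, v in zip(theta, velocity)]
    return theta


# Toy objective (made up): a poorly scaled quadratic with minimum at (3, -1).
def loss(x):
    return (x[0] - 3.0) ** 2 + 10.0 * (x[1] + 1.0) ** 2


def grad(x):
    return [2.0 * (x[0] - 3.0), 20.0 * (x[1] + 1.0)]


start = [0.0, 0.0]
result = rmsprop_nesterov(grad, start)
initial_loss = loss(start)
final_loss = loss(result)
print(initial_loss, final_loss)
```

Because each gradient component is divided by its own running magnitude, the two differently scaled coordinates move at comparable speeds. The lookahead gradient lets the momentum term slow down before overshooting.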
Abstract:
Many real-world datasets comprise different representations or views that provide complementary information to each other. To integrate information from multiple views, data integration approaches such as nonnegative matrix factorization (NMF) have been developed to combine multiple heterogeneous data sources simultaneously and obtain a comprehensive representation. In this paper, we propose a novel variant of symmetric nonnegative matrix factorization (SNMF), called Laplacian regularization based joint symmetric nonnegative matrix factorization (LJ-SNMF), for clustering multi-view data. We conduct extensive experiments on several realistic datasets, including Human Microbiome Project data. The experimental results show that the proposed method outperforms other variants of NMF, which suggests the potential application of LJ-SNMF in clustering multi-view datasets. Additionally, we demonstrate the capability of LJ-SNMF in community finding.
Author affiliations:
[Jiang, Xingpeng] Cent China Normal Univ, Sch Comp Sci, Wuhan, Hubei, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.;[Xu, Weiwei] Wuhan Univ, Int Sch Software, Wuhan, Hubei, Peoples R China.
Corresponding institution:
[Jiang, Xingpeng] Cent China Normal Univ, Sch Comp Sci, Wuhan, Hubei, Peoples R China.
Keywords:
Human microbiome;data integration;data representation;multi-view clustering;nonnegative matrix factorization
Abstract:
Microbiome datasets often comprise different representations or views, such as metabolic pathways, taxonomic assignments and gene families, which provide complementary information for understanding microbial communities. Data integration methods, including approaches based on nonnegative matrix factorization (NMF), combine multi-view data to create a comprehensive view of a given microbiome study. In this paper, we propose a novel variant of NMF called Laplacian regularized joint nonnegative matrix factorization (LJ-NMF) for integrating functional and phylogenetic profiles from the HMP. We compare the performance of this method to other variants of NMF. The experimental results indicate that the proposed method offers an efficient framework for microbiome data analysis.
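For readers unfamiliar with this family of models, a generic form of a Laplacian-regularized joint NMF objective is sketched below. All symbols are assumptions introduced for illustration: X_v is the nonnegative samples-by-features matrix of view v, W is a sample factor shared across views, H_v is the view-specific feature factor, L is a graph Laplacian over samples, and lambda is the regularization weight. The paper's actual formulation (for example, which factor is shared and how L is built) may differ.

```latex
\min_{W \ge 0,\; H_v \ge 0} \;
\sum_{v=1}^{V} \left\lVert X_v - W H_v \right\rVert_F^2
\; + \;
\lambda \, \operatorname{Tr}\!\left( W^{\top} L \, W \right)
```

The first term asks one shared low-dimensional representation W to explain every view, and the trace term penalizes representations that place neighboring samples in the graph far apart.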
Author affiliations:
[Jiang, Xingpeng; Xie, Wei; Yang, Jincai; He, Tingting; Shen, Xianjun; Hu, Po; Hu, Xiaohua; Yi, Li] Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.;[Yi, Li] Letv Cloud Comp Co Ltd, Beijing, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA.
Corresponding institution:
[Shen, Xianjun] Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China.
Keywords:
Protein complexes;Algorithms;Protein interaction networks;Gene expression;Protein interactions;Forecasting;Genetic networks;Yeast
Abstract:
Identifying protein complexes is an important and challenging task in proteomics. It would contribute greatly to our knowledge of the molecular mechanisms of cellular activities. However, the inherent organization and dynamic characteristics of the cell system have rarely been incorporated into existing algorithms for detecting protein complexes because of the limitations of protein-protein interaction (PPI) data produced by high-throughput techniques. The availability of time-course gene expression profiles enables us to uncover the dynamics of molecular networks and improve the detection of protein complexes. To achieve this goal, this paper proposes a novel algorithm, DCA (Dynamic Core-Attachment). It detects protein-complex cores comprising continually expressed and highly connected proteins in a dynamic PPI network, and then forms the protein complex by adding attachments with high adhesion to the core. The integration of the core-attachment feature into the dynamic PPI network is responsible for the superiority of our algorithm. DCA has been applied to two different yeast dynamic PPI networks, and the experimental results show that it performs significantly better than state-of-the-art techniques in terms of prediction accuracy, hF-measure and statistical significance in biology. In addition, the identified complexes with strong biological significance provide potential candidate complexes for biologists to validate.
Author affiliations:
[Jiang, Xingpeng; Yang, Jincai; He, Tingting; Shen, Xianjun; Hu, Xiaohua; Chen, Yao] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.
Corresponding institution:
[Shen, Xianjun] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.
Keywords:
Disease network;Heterogeneous network;Microbe network;Random walk
Abstract:
The microbiota show remarkable variability within individuals. At the same time, the microorganisms living in the human body play a very important role in health and disease, so identifying the relationships between microbes and diseases will contribute to a better understanding of microbial interactions and functional mechanisms. However, while sequencing technologies yield large amounts of microbial data, very few associations between diseases and microbes are known. In bioinformatics, many researchers use network topology analysis to address such problems. Inspired by this idea, we propose a new method for prioritizing candidate microbes to predict potential disease-microbe associations. First, we connect the disease network and the microbe network based on known disease-microbe relationships to construct a heterogeneous network. We then extend the random walk to the heterogeneous network and use leave-one-out cross-validation and ROC curves to evaluate the method. The results indicate that the algorithm can effectively disclose potential associations between diseases and microbes that cannot be found using the microbe network or the disease network alone. Furthermore, we study three representative diseases, Type 2 diabetes, asthma and psoriasis, and present the potential microbes associated with each of these diseases by ranking candidate disease-causing microbes. The newly discovered associations may support the understanding of disease mechanisms, diagnosis and therapy.
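The construction described above can be sketched with a small pure-Python example. It is a deliberately simplified stand-in: the disease network, the microbe network and the known associations are merged into one graph, and an ordinary random walk with restart is run from the query disease. The paper's extension to heterogeneous networks may treat within-network and cross-network jumps separately. All names, the restart probability and the toy network are assumptions for illustration.

```python
def row_normalize(matrix):
    """Turn an adjacency matrix into row-stochastic transition probabilities."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at the seeds."""
    n = len(adjacency)
    transition = row_normalize(adjacency)
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [restart * p0[j]
               + (1.0 - restart) * sum(p[i] * transition[i][j] for i in range(n))
               for j in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy heterogeneous network (made up): two diseases followed by three microbes.
names = ["d0", "d1", "m0", "m1", "m2"]
adjacency = [
    # d0 d1 m0 m1 m2
    [0, 1, 1, 0, 0],  # d0: similar to d1, known association with m0
    [1, 0, 0, 1, 0],  # d1: similar to d0, known association with m1
    [1, 0, 0, 1, 0],  # m0: associated with d0, interacts with m1
    [0, 1, 1, 0, 1],  # m1: associated with d1, interacts with m0 and m2
    [0, 0, 0, 1, 0],  # m2: interacts with m1 only
]

# Rank microbes not yet linked to d0 by their steady-state probability.
scores = random_walk_with_restart(adjacency, seeds=[0])
candidates = [3, 4]
ranked = sorted(candidates, key=lambda i: scores[i], reverse=True)
print([names[i] for i in ranked])
```

In this toy network, m1 ranks above m2 because m2 can only be reached through m1. The same ranking step, repeated while hiding one known association at a time, is the basis of leave-one-out evaluation.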