Authors:
Zhang, Yi;Zhou, Guangyou;Xie, Zhiwen;Huang, Jimmy Xiangji
Journal:
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30:816-828, ISSN: 2329-9290
Corresponding author:
Zhou, GY
Author affiliations:
[Zhang, Yi] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China.;[Zhou, Guangyou; Zhou, GY] Cent China Normal Univ, Sch Comp Sci, Wuhan 430079, Peoples R China.;[Xie, Zhiwen] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China.;[Huang, Jimmy Xiangji] York Univ, Sch Informat Technol, Informat Retrieval & Knowledge Management Res Lab, Toronto, ON M3J 1P3, Canada.
Corresponding institution:
[Zhou, GY] Cent China Normal Univ, Sch Comp Sci, Wuhan 430079, Peoples R China.
Keywords:
Mathematical models;Decoding;Natural languages;Encoding;Electronic mail;Arithmetic;Task analysis;Math word problem;natural language processing;text mining;representation learning
Abstract:
Designing algorithms to solve math word problems (MWPs) is an important research topic in natural language processing and smart education domains. The task of solving MWPs involves transforming math problem texts into math equations. Although recent Graph2Tree-based models, which adopt homogeneous graph encoders to learn quantity representations, have obtained very promising results in generating math equations, they do not consider the heterogeneous issue and the long-distance dependencies of heterogeneous nodes. In this paper, we propose a novel hierarchical heterogeneous graph encoding called HGEN for MWPs. Specifically, HGEN first introduces a heterogeneous graph consisting of a node-level attention layer and a type-aware attention layer to learn the heterogeneous node embedding. HGEN then captures the long-distance dependent information by propagating the multi-hop nodes in a hierarchical manner. We conduct extensive experiments on two popular MWP datasets. Our empirical results show that HGEN significantly outperforms the state-of-the-art Graph2Tree-based models in the literature.
Authors:
Zhang, Qixuan;Weng, Xinyi;Zhou, Guangyou;Zhang, Yi;Huang, Jimmy Xiangji
Journal:
Information Processing & Management, 2022, 59(3):102933, ISSN: 0306-4573
Corresponding author:
Guangyou Zhou
Author affiliations:
[Zhou, Guangyou; Zhang, Yi; Weng, Xinyi; Zhang, Qixuan] Cent China Normal Univ, Sch Comp, Wuhan, Peoples R China.;[Huang, Jimmy Xiangji] York Univ, Sch Informat Technol, Toronto, ON, Canada.
Corresponding institution:
[Guangyou Zhou] School of Computer, Central China Normal University, Wuhan, China
Journal:
Journal of Advanced Transportation, 2021, vol. 2021, ISSN: 0197-6729
Author affiliations:
[Lu, Zizhou; Luo, Changri] Cent China Normal Univ, Sch Vocat & Continuing Educ, Wuhan 430079, Peoples R China.;[Zhang, Xinhua] Sch Comp Sci Wuhan Vocat Coll Software & Engn Wuh, Wuhan 430205, Peoples R China.;[He, Tingting; Zhang, Yong] Cent China Normal Univ, Acad Comp Sci, Wuhan 430079, Peoples R China.;[Xiong, Neal] Northeastern State Univ, Dept Math & Comp Sci, Tahlequah, OK USA.
Abstract:
With the development of online learning and distance education, online learners' forum discussions have become increasingly effective in facilitating learning. Superposters, who play an increasingly important role in forums, have attracted close attention from researchers. The key research question is how to identify superposters among a large number of participants. Some studies focus on the network interaction of superposters and on some content-related features, but neglect basic qualities that a superposter should possess, such as language expression, and learning-related features, such as learning collaboration. Based on an analysis of an online learning corpus, and by combining network interaction with different N-gram features, this paper proposes a superposter identification method based on three primary features, namely language expression (L), content quality (C), and social network interaction (S), and eight secondary features, including learning collaboration. The method was applied to a real online learning forum corpus to identify 28 preset superposters, achieving P@15=1.0, Avg.P@15=1.0, P@28=0.86, and Avg.P@28=0.95. The experiments show that this is an effective method for identifying superposters in online learning forums.
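The P@k and Avg.P@k figures reported above can be illustrated with a minimal Python sketch. The abstract does not define Avg.P@k, so the definition used here (the mean of P@i over the ranks i ≤ k that hold a preset superposter) is an assumption, and the function names and the toy ranking are hypothetical.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked users that are preset superposters."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for user in ranked[:k] if user in relevant)
    return hits / k


def average_precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Assumed definition: mean of P@i over ranks i <= k holding a relevant user."""
    hits = 0
    total = 0.0
    for i, user in enumerate(ranked[:k], start=1):
        if user in relevant:
            hits += 1
            total += hits / i  # precision at this rank
    return total / hits if hits else 0.0


# Hypothetical ranking: "x" is not a preset superposter.
ranked_users = ["a", "b", "x", "c"]
preset = {"a", "b", "c"}
print(precision_at_k(ranked_users, preset, 2))
print(precision_at_k(ranked_users, preset, 4))
print(average_precision_at_k(ranked_users, preset, 4))
```

Under this reading, P@28=0.86 would correspond to 24 of the 28 top-ranked users being preset superposters.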
Abstract:
In this paper, we propose a new model that combines reinforcement learning and adversarial training to exploit data generated by distant supervision for named entity recognition. Our model not only reduces the influence of noise in the generated data but also finds more informative instances for training. In the pre-training stage, to make full use of the data generated by distant supervision, we use reinforcement learning to select reliable instances for pre-training a classifier. In the training stage, we introduce an adversarial training mechanism, which not only finds more reliable instances to strengthen the classifier but also uses noisy data to improve the model's robustness to noise. To evaluate the model, we conduct experiments on two public datasets: the Species800 dataset in biology and the EC dataset in the e-commerce domain. The experimental results show that the F1 score of our model is 1.68% higher than the baseline on Species800 and 6.32% higher than the baseline on EC. Compared with state-of-the-art models, our model achieves comparable performance using only word2vec embeddings.
Journal:
ICBDE '20: Proceedings of the 2020 3rd International Conference on Big Data and Education, 2020, 2: Pages 37–42
Author affiliations:
[Yong Zhang; Fen Chen; Wufeng Zhang; Haoyang Zuo; Fangyuan Yu] Computer School, Central China Normal University, Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Wuhan, China
Conference name:
978-1-4503-7498-9
Conference date:
April, 2020
Conference location:
London, United Kingdom
Proceedings title:
ICBDE '20: Proceedings of the 2020 The 3rd International Conference on Big Data and Education
Abstract:
To improve keyword extraction by enhancing the semantic representations of documents, we propose a keyword extraction method that exploits both the document's internal semantic information and word representations pre-trained on massive external documents. First, we use the deep learning tool Word2Vec to characterize the external document information and evaluate the similarity between words by cosine distance, which yields the semantic relations between words in the external documents. Then, the word-to-word similarity replaces the probability transfer matrix in TextRank over the word graph of the target document. At the same time, the title and abstract of the internal document are used to construct the words' semantic graph for keyword extraction. The experiments use related academic paper data from AMiner as the experimental data set. The results show that our method outperforms the TextRank algorithm: for the five keywords, precision, recall and F-score increase by 28.60%, 10.70% and 12.90%, respectively, compared with the single TextRank algorithm.
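The step in which word-to-word similarity replaces the TextRank transition matrix can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the similarity matrix is taken as given (in the paper it would come from Word2Vec), the row normalisation, the damping factor 0.85, and the names `similarity_textrank`, `words`, and `sim` are all hypothetical choices.

```python
from typing import Dict, List, Sequence


def similarity_textrank(
    words: Sequence[str],
    sim: Sequence[Sequence[float]],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Rank words by a PageRank-style walk whose transitions follow similarity."""
    n = len(words)
    if n == 0:
        return {}
    # Build a row-stochastic transition matrix from similarities (no self-links).
    trans: List[List[float]] = []
    for i in range(n):
        row = [max(sim[i][j], 0.0) if i != j else 0.0 for j in range(n)]
        total = sum(row)
        if total > 0:
            trans.append([v / total for v in row])
        else:
            trans.append([1.0 / n] * n)  # isolated word: jump uniformly
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1.0 - damping) / n
            + damping * sum(scores[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    return dict(zip(words, scores))


# Toy similarity matrix (made-up values): "graph" is close to both other words.
toy_words = ["graph", "node", "edge"]
toy_sim = [[1.0, 0.9, 0.9], [0.9, 1.0, 0.1], [0.9, 0.1, 1.0]]
ranking = similarity_textrank(toy_words, toy_sim)
print(sorted(ranking, key=ranking.get, reverse=True))
```

The highest-scoring words would then be taken as keyword candidates; how the title and abstract information is folded into the graph is not specified in the abstract and is left out here.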
Journal:
ICBDE '20: Proceedings of the 2020 3rd International Conference on Big Data and Education, 2020: Pages 30–36
Author affiliations:
[Ying Su] College of Information Science and Engineering, Wuchang Shouyi University, Wuhan, China;[Yong Zhang] Computer School, Central China Normal University, Wuhan, Hubei Province, China
Conference name:
978-1-4503-7498-9
Conference date:
April, 2020
Conference location:
London, United Kingdom
Proceedings title:
ICBDE '20: Proceedings of the 2020 The 3rd International Conference on Big Data and Education
Abstract:
In this paper, we propose a method for automatically constructing a subject knowledge graph for educational applications. The subject knowledge graph is built from educational big data using a bootstrapping strategy that gradually expands the knowledge points and the connections between them. Two different datasets are used. The first consists of subject teaching resources such as syllabuses, teaching plans, and textbooks, which are used to automatically construct the core of the subject knowledge graph and thereby reduce the dependence on manual annotation. The high quality of these teaching resources also guarantees the accuracy of the knowledge graph core. The second consists of massive Internet encyclopedia texts, which are used to expand and complete the subject knowledge graph. As for the algorithm, this paper uses the BERT-BiLSTM-CRF model to automatically identify subject knowledge points, and then evaluates the relationships between knowledge points by calculating their semantic similarity, PMI, and Normalized Google Distance. The experimental results show that BERT-BiLSTM-CRF significantly outperforms the baselines, and that the three relationship evaluation models achieve good results. Finally, subject knowledge graphs for computer science and physics are constructed successfully as examples, which shows the effectiveness of our method.
Author affiliations:
[Sun, Bo; Pan, Min] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Hubei, Peoples R China.;[Pan, Min] Hubei Normal Univ, Sch Comp & Informat Engn, Huangshi 435002, Hubei, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Zhang, Yue; Zhu, Qiang] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.
Corresponding institution:
[He, Tingting] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.
Conference name:
International Conference on Intelligent Computing (ICIC) / Intelligent Computing and Biomedical Informatics (ICBI) Conference - Medical Informatics and Decision Making
Conference date:
AUG 15-18, 2018
Conference location:
PEOPLES R CHINA
Conference organizer:
[Pan, Min;Sun, Bo] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Hubei, Peoples R China.^[Pan, Min] Hubei Normal Univ, Sch Comp & Informat Engn, Huangshi 435002, Hubei, Peoples R China.^[Zhang, Yue;Zhu, Qiang;He, Tingting;Jiang, Xingpeng] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.
Journal:
ICBDE '19: Proceedings of the 2019 International Conference on Big Data and Education, 2019: 43-47
Corresponding author:
Zhang, Yong
Author affiliations:
[Li, Yu; Zhao, Jingjing; Yang, Liping; Zhang, Yong] Cent China Normal Univ, Comp Sch, Wuhan, Hubei, Peoples R China.
Corresponding institution:
[Zhang, Yong] Cent China Normal Univ, Comp Sch, Wuhan, Hubei, Peoples R China.
Conference name:
International Conference on Big Data and Education (ICBDE)
Conference date:
MAR 30-APR 01, 2019
Conference location:
Univ Greenwich, London, ENGLAND
Conference organizer:
Univ Greenwich
Proceedings title:
ICBDE'19: Proceedings of the 2019 International Conference on Big Data and Education
Keywords:
Knowledge Graph;Entity Relation Extraction;Normalized Google Distance;Intelligent Education;Graph Visualization
Abstract:
To make full use of specialized vocabulary in computer science and to discover relationships among these words, a Chinese knowledge graph of the computer science major is constructed from internet web pages, and a knowledge graph visualization and a learning guidance application are then developed on top of it. To construct the computer science knowledge graph, a small number of important specialized computer science words are collected manually and then extended using Baidu Baike (baike.baidu.com), yielding about 3000 specialized words (called entries). The similarity between two entries is calculated based on the Normalized Google Distance (NGD). When the similarity is greater than a preset threshold, a link between the two entries is created. Finally, the knowledge graph is constructed from these words and the links between them. The relation type of each link is ignored for simplicity. Furthermore, the graph visualization is implemented with a tool called sigma.js, and a learning guidance application is developed with J2EE. Through the application, students can get a visualized overview of the computer science major and make a learning plan efficiently. Moreover, the application and the knowledge graph construction method can easily be applied to other majors.
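The NGD-based linking step can be sketched in Python as follows. The NGD formula is the standard one computed from page counts; everything else is an assumption made for illustration. In particular, the abstract speaks of a similarity exceeding a threshold without saying how NGD is converted to a similarity, so this sketch instead links two entries when their NGD falls below a hypothetical cutoff `max_distance`. The function names and the toy counts are made up.

```python
import math
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple


def normalized_google_distance(fx: int, fy: int, fxy: int, n: int) -> float:
    """NGD from page counts: fx, fy per term, fxy for both terms, n pages in total."""
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf  # no (co-)occurrence: treat the terms as unrelated
    log_fx, log_fy = math.log(fx), math.log(fy)
    denom = math.log(n) - min(log_fx, log_fy)
    if denom <= 0:
        raise ValueError("n must exceed the smaller of the two page counts")
    return (max(log_fx, log_fy) - math.log(fxy)) / denom


def build_links(
    counts: Dict[str, int],
    pair_counts: Dict[FrozenSet[str], int],
    n: int,
    max_distance: float,
) -> List[Tuple[str, str]]:
    """Create an untyped link for every pair of entries closer than max_distance."""
    links = []
    for a, b in combinations(sorted(counts), 2):
        fxy = pair_counts.get(frozenset((a, b)), 0)
        if normalized_google_distance(counts[a], counts[b], fxy, n) < max_distance:
            links.append((a, b))
    return links


# Made-up page counts for three entries; only one pair co-occurs.
toy_counts = {"stack": 1000, "queue": 100, "tree": 100}
toy_pairs = {frozenset(("stack", "queue")): 10}
print(build_links(toy_counts, toy_pairs, n=10**6, max_distance=0.6))
```

As in the abstract, the links carry no relation type; the resulting edge list could be passed to a graph visualization front end.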
Abstract:
Microbiome datasets often comprise different representations or views, such as genes, functions, and taxonomic assignments, which provide complementary information. Integrating multi-view information for clustering microbiome samples could create a comprehensive view of a given microbiome study. Similarity network fusion (SNF) can efficiently integrate the similarities built from each view of the data into a single network that represents the full spectrum of the underlying data. Based on this method, we develop a Robust Similarity Network Fusion (RSNF) approach, which combines the strength of random forests with the advantage of SNF in data aggregation. The experimental results indicate the strength of the proposed strategy. The method substantially improves clustering performance compared with several state-of-the-art methods on several datasets.
Abstract:
As software entities that migrate among nodes, mobile agents (MAs) are able to deliver and execute codes for flexible application re-tasking, local processing, and collaborative signal and information processing. In contrast to the conventional wireless sensor network operations based on the client-server computing model, recent research has shown the efficiency of agent-based data collection and aggregation in collaborative and ubiquitous environments. In this paper, we consider the problem of calculating multiple itineraries for MAs to visit source nodes in parallel. Our algorithm iteratively partitions a directional sector zone where the source nodes are included in an itinerary. The length of an itinerary is controlled by the angle of the directional sector zone in such a way that near-optimal routes for MAs can be obtained by selecting the angle efficiently in an adaptive fashion. Simulation results confirm the effectiveness of the proposed algorithm as well as its performance gain over alternative approaches.
Author affiliations:
[Tan, Liansheng; Zhang, Yongchang] Cent China Normal Univ, Dept Comp Sci, Wuhan 43007, Peoples R China.
Corresponding institution:
[Tan, Liansheng] Cent China Normal Univ, Dept Comp Sci, Wuhan 43007, Peoples R China.
Keywords:
Network utility maximization (NUM);Resource allocation;Wireless network;Fairness index;Principle of equality and diminishing marginal utility (PEDMU)
Abstract:
In this paper, we study the optimal resource allocation problem in a wireless network, where all types of traffic, including best effort and quality of service (QoS; Soft QoS and Hard QoS), are described by a unified utility function. The problem is cast as a network utility maximization (NUM) model. We formulate the fairness index in terms of users' utility and traffic type parameters, and then study their relationships. The law of diminishing marginal utility is widely accepted in economics. In this paper, we establish the principle of equality and diminishing marginal utility, which enables us to find the desired optimal solution to the NUM model both when the total resource is sufficient and when it is insufficient. We propose some essential theorems and algorithms to find the optimal solution in these two cases. The proposed algorithms are evaluated via simulation. The theoretical analysis and simulation results not only validate the efficacy and efficiency of the proposed algorithms but also reveal the relation between the optimal resource allocation and the factors of traffic type, total available resource, and user's channel quality, as well as the relation between fairness and total resource with respect to a given allocation scheme.
Abstract:
Protein-protein interaction (PPI) plays an important role in understanding biological processes. To resolve the parsing errors resulting from modal verb phrases and the noise introduced by appositive dependencies, this paper proposes an improved tree kernel-based PPI extraction method. The new method considers both modal verb and appositive dependency features to define processing rules that effectively optimize and expand the shortest dependency path between two proteins. On the basis of these rules, the optimized and expanded path is used to guide the pruning of the constituent parse tree, which makes the constituent parse tree used for protein-protein interaction extraction more precise and concise. The experimental results show that the new method achieves better results on five commonly used corpora.
Journal:
Lecture Notes in Computer Science, 2015, 9403:336-347, ISSN: 0302-9743
Corresponding author:
Hu, Po
Author affiliations:
[Hu, Po; Zhang, Yong] Cent China Normal Univ, Sch Comp Sci, Wuhan, Peoples R China.;[He, Jiacong] Medallia Inc, Palo Alto, CA USA.
Corresponding institution:
[Hu, Po] Cent China Normal Univ, Sch Comp Sci, Wuhan, Peoples R China.
Conference name:
8th International Conference on Knowledge Science, Engineering and Management (KSEM)
Conference date:
OCT 28-30, 2015
Conference location:
SW Univ, Fac Comp & Informat Sci, Chongqing, PEOPLES R CHINA
Conference organizer:
SW Univ, Fac Comp & Informat Sci
Proceedings title:
Lecture Notes in Artificial Intelligence
Keywords:
Query-focused multi-document summarization;Manifold ranking;Affinity graph construction
Abstract:
Manifold ranking is one of the most competitive approaches for query-focused multi-document summarization. Despite its success on this task, it usually first constructs a sentence affinity graph based on inter-sentence content similarity, and then performs manifold ranking on the graph to score each sentence, under the assumption that all the sentences lie on a single manifold. In practice, for a document set to be summarized, the distribution of the sentences might form different but related manifolds. This paper aims to generalize the basic manifold-ranking approach to this more general setting by introducing a novel affinity graph for estimating the similarity between sentences, which jointly leverages the local geometric structures and the contents of the sentences. Preliminary experimental results on the DUC datasets demonstrate the effectiveness of the proposed approach.
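For orientation, the basic manifold-ranking procedure that the abstract takes as its starting point can be sketched as below. This is the standard iteration on a symmetrically normalised affinity matrix, not the paper's novel affinity graph, whose construction the abstract does not specify. The affinity values, the choice alpha=0.85, and the name `manifold_rank` are assumptions for illustration; node 0 plays the role of the query.

```python
import math
from typing import List, Sequence


def manifold_rank(
    w: Sequence[Sequence[float]],
    y: Sequence[float],
    alpha: float = 0.85,
    iterations: int = 100,
) -> List[float]:
    """Iterate f <- alpha * S f + (1 - alpha) * y with S = D^(-1/2) W D^(-1/2)."""
    n = len(w)
    degree = [sum(row) for row in w]
    # Symmetric normalisation of the affinity matrix (W is assumed to have a zero diagonal).
    s = [
        [
            w[i][j] / math.sqrt(degree[i] * degree[j])
            if degree[i] > 0 and degree[j] > 0
            else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]
    f = list(y)
    for _ in range(iterations):
        f = [
            alpha * sum(s[i][j] * f[j] for j in range(n)) + (1.0 - alpha) * y[i]
            for i in range(n)
        ]
    return f


# Toy graph: node 0 is the query; sentence 1 is close to it, sentence 2 is not.
affinity = [[0.0, 0.9, 0.1], [0.9, 0.0, 0.1], [0.1, 0.1, 0.0]]
prior = [1.0, 0.0, 0.0]  # only the query starts with a non-zero score
print(manifold_rank(affinity, prior))
```

In a summarizer, the sentences with the highest scores would be candidates for the summary; replacing `affinity` with a different graph is where the approach described above would differ from this baseline.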