Journal:
IEEE Transactions on Industrial Informatics, 2023: 1-11, ISSN: 1551-3203
Corresponding author:
Yang, B;Liu, H
Author affiliations:
[Yang, Bing; Liu, Tingting] Hubei Univ, Sch Educ, 368 Youyi Rd, Wuhan 430062, Hubei, Peoples R China.;[Yang, Bing; Liu, Tingting] City Univ Hong Kong, Dept Mech Engn, Kowloon, Hong Kong, Peoples R China.;[Zhang, Zhaoli; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.
Corresponding institution:
[Yang, B ] H;[Liu, H ] C;Hubei Univ, Sch Educ, 368 Youyi Rd, Wuhan 430062, Hubei, Peoples R China.;Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.
Abstract:
2D human pose estimation (HPE) has been widely used in many fields, such as behavioral understanding, identity authentication, and industrial automatic manufacturing. Most previous studies have encountered many constraints, such as restricted scenarios and strict inputs. To solve this problem, we present a simple yet effective HPE network called limb direction cues-aware network (LDCNet) with limb direction cues and differentiated Cauchy labels, which can efficiently suppress uncertainties and prevent deep networks from over-fitting uncertain keypoint positions. In particular, LDCNet suppresses the uncertainties from two aspects. (1) A differentiated Cauchy coordinate encoding method is designed to reveal the limb direction information among adjacent keypoints. (2) Jeffreys divergence is introduced as the loss function to measure the discrepancy between the predicted heatmap and the ground-truth heatmap. Positions of keypoints are perceived by the limb direction-based deep network in an end-to-end manner. An extensive study on two benchmark data sets (i.e., MS COCO and MPII) illustrates the superiority of the proposed LDCNet model over state-of-the-art approaches.
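As a rough illustration of the loss named in this abstract: Jeffreys divergence is the symmetric sum of the two Kullback-Leibler divergences, J(P, Q) = KL(P || Q) + KL(Q || P). The sketch below is not the LDCNet implementation; the function names, the flattening of each heatmap into a probability distribution, and the small `eps` smoothing term are all assumptions made for this example.

```python
import math


def normalize(heatmap, eps=1e-12):
    """Flatten a 2D heatmap and rescale it so the values sum to 1.

    eps is a hypothetical smoothing constant that keeps log() finite
    when a heatmap cell is exactly zero.
    """
    flat = [value + eps for row in heatmap for value in row]
    total = sum(flat)
    return [value / total for value in flat]


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def jeffreys_divergence(pred_heatmap, target_heatmap):
    """Jeffreys divergence: KL(p || q) + KL(q || p), symmetric in its arguments."""
    p = normalize(pred_heatmap)
    q = normalize(target_heatmap)
    return kl_divergence(p, q) + kl_divergence(q, p)
```

Because the measure is symmetric, swapping the predicted and ground-truth heatmaps gives the same value, and it is zero when the two normalized heatmaps coincide.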
Abstract:
Sustained attention is one of the basic abilities of humans to maintain concentration on relevant information while ignoring irrelevant information over extended periods. The purpose of the review is to provide insight into how to integrate neural mechanisms of sustained attention with computational models to facilitate research and application. Although many studies have assessed attention, the evaluation of humans' sustained attention is not sufficiently comprehensive. Hence, this study provides a current review on both neural mechanisms and computational models of visual sustained attention. We first review models, measurements, and neural mechanisms of sustained attention and propose plausible neural pathways for visual sustained attention. Next, we analyze and compare the different computational models of sustained attention that the previous reviews have not systematically summarized. We then provide computational models for automatically detecting vigilance states and evaluation of sustained attention. Finally, we outline possible future trends in the research field of sustained attention.
Journal:
Information Processing & Management, 2023, 60(1): 103106, ISSN: 0306-4573
Corresponding author:
Jing Wang
Author affiliations:
[Yang, Shuoqiu; Li, Hao; Hu, Zhuang; Du, Xu; Wang, Jing] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.;[Yang, Shuoqiu; Li, Hao; Hu, Zhuang; Du, Xu; Wang, Jing] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China.
Corresponding institution:
[Jing Wang] N;National Engineering Research Center for E-Learning, Central China Normal University, Wuhan 430079, China;Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan 430079, China
Keywords:
Bi-hypergraph network;Intelligent education;Knowledge hypergraph;Teaching image annotation;Visual-knowledge features fusion;Visual-knowledge inconsistency
Author affiliations:
[Ullah, Anwar; Yu, Xinguo; Yu, XG] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Numan, Muhammad] Cent China Normal Univ, Wollongong Joint Inst, Wuhan 430079, Peoples R China.
Corresponding institution:
[Yu, XG ] C;Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.
Abstract:
Generating realistic and synthetic video from text is a highly challenging task due to the multitude of issues involved, including digit deformation, noise interference between frames, blurred output, and the need for temporal coherence across frames. In this paper, we propose a novel approach for generating coherent videos of moving digits from textual input using a Deep Deconvolutional Generative Adversarial Network (DD-GAN). The DD-GAN comprises a Deep Deconvolutional Neural Network (DDNN) as a Generator (G) and a modified Deep Convolutional Neural Network (DCNN) as a Discriminator (D) to ensure temporal coherence between adjacent frames. The proposed research involves several steps. First, the input text is fed into a Long Short-Term Memory (LSTM) based text encoder and then smoothed using Conditioning Augmentation (CA) techniques to enhance the effectiveness of the Generator (G). Next, the DDNN generates video frames from the enhanced text and random noise, and the modified DCNN acts as the Discriminator (D), distinguishing between generated and real videos. This research evaluates the quality of the generated videos using standard metrics like Inception Score (IS), Frechet Inception Distance (FID), Frechet Inception Distance for video (FID2vid), and Generative Adversarial Metric (GAM), along with a human study based on realism, coherence, and relevance. By conducting experiments on Single-Digit Bouncing MNIST GIFs (SBMG), Two-Digit Bouncing MNIST GIFs (TBMG), and a custom dataset of essential mathematics videos with related text, this research demonstrates significant improvements in both metrics and human study results, confirming the effectiveness of DD-GAN. This research also took on the challenge of generating preschool math videos from text, handling complex structures, digits, and symbols, and achieving successful results. The proposed research demonstrates promising results for generating coherent videos from textual input.
Journal:
Neural Computing and Applications, 2023, 35(11): 8343-8356, ISSN: 0941-0643
Corresponding author:
Baolin Yi
Author affiliations:
[Shen, Xiaoxuan; Wang, Wei; Zhang, Huanyu; Li, Zhifei; Yi, Baolin] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.
Corresponding institution:
[Baolin Yi] N;National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, China
Keywords:
Link prediction;Knowledge graph embedding;Convolution neural network;Feature interaction;Complex relations
Abstract:
Most knowledge graphs (KGs) are large and incomplete graph-structured databases, which can be completed by predicting missing links according to the existing knowledge. The mainstream method is knowledge graph embedding (KGE), which is designed to learn low-dimensional embeddings of entities and relations. However, knowledge graph embedding still faces two major issues: (1) How to generate more expressive embeddings? (2) How to resolve the semantic polysemy of entities in different relations? In this paper, we propose a novel KG embedding model, RIECN (Relation-based Interactive Embedding Convolutional Network), which achieves high-quality performance and shows some advancements in modeling complex relations. In RIECN, the FIR (Feature Interaction Reshaping) method is introduced to increase the feature interactions between entity and relation embeddings to generate more expressive feature maps. In addition, a new method of generating relation-based dynamic convolution filters, RDCF, is proposed. RDCF generates relation-specific and hybrid-size convolution filters, which enrich the feature maps of each entity and improve the accuracy of the link prediction task, especially in complex relation scenarios. We tested the performance of our model on five benchmark datasets. The experimental results show that the RIECN model significantly outperforms recent state-of-the-art models by 0.1–3.2% and 1.1–3.7% in terms of the MRR and Hit@1 metrics, respectively.
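For readers unfamiliar with the two link prediction metrics quoted in this abstract, the sketch below computes them from a list of ranks, assuming the standard definitions of mean reciprocal rank and Hits@k (ranks start at 1, where 1 means the correct entity was scored highest). The function names and the sample ranks are illustrative only and are not taken from the paper.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test triples (ranks are 1-based)."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test triples whose correct entity is ranked within the top k."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Hypothetical ranks of the correct entity for four test triples.
example_ranks = [1, 2, 4, 1]
mrr = mean_reciprocal_rank(example_ranks)   # (1 + 1/2 + 1/4 + 1) / 4 = 0.6875
hits1 = hits_at_k(example_ranks, 1)         # 2 of 4 ranks are 1 -> 0.5
```

Both metrics lie in (0, 1], and higher is better; Hits@1 counts only exact top-ranked predictions, whereas mean reciprocal rank also rewards near misses.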
Journal:
Journal of King Saud University - Computer and Information Sciences, 2023, 35(7): 101605, ISSN: 1319-1578
Corresponding author:
Zhang, Q
Author affiliations:
[Wang, Xiaoguang; Wang, Shutong] Wuhan Univ, Sch Informat Management, Wuhan 430072, Peoples R China.;[Wang, Shutong] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.;[Zhao, Anran] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China.;[Lai, Chenghang] Fudan Univ, Sch Comp Sci, Shanghai 200438, Peoples R China.;[Zhang, Qi; Zhang, Q] Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Peoples R China.
Corresponding institution:
[Zhang, Q ] C;Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Peoples R China.
Keywords:
Facial expression recognition;Graph convolutional network;Geometry cue;Uncertainty;Emotion label distribution learning
Abstract:
Facial expression recognition (FER) task in the wild is challenging due to some uncertainties, such as the ambiguity of facial expressions, subjective annotations, and low-quality facial images. A novel model for FER in-the-wild datasets is proposed in this study to solve these uncertainties. The overview of the proposed method is as follows. First, the facial images are grouped into high and low uncertainties by the pre-trained network. The graph convolutional network (GCN) framework is then used for the facial images with low uncertainty to obtain geometry cues, including the relationship among action units (AUs) and the implicit connection between AUs and expressions, which help predict the probability of the underlying emotional label. The emotion label distribution is produced by combining the predicted latent label probability and the given label. For the facial images with high uncertainty, k-nearest neighbor graphs are built to determine the k facial images in the low uncertainty group with the highest similarity to the given facial image. The emotion label distribution of the given image is then replaced by fusing the emotion label distribution based on the distances between the given image and its adjacent images. Finally, the constructed emotion label distribution facilitates training in a straightforward manner using a convolutional neural network framework to identify facial expressions. Experimental results on RAF-DB, FERPlus, AffectNet, and SFEW2.0 datasets demonstrate that the proposed method achieved superior performance compared to state-of-the-art approaches. (c) 2023 The Author(s). Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
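The neighbor-based step for high-uncertainty images can be sketched as follows. The abstract says only that the label distributions of the k most similar low-uncertainty images are fused based on distance; the Euclidean distance, the inverse-distance weighting, the `eps` term, and all function and parameter names below are assumptions made for this example, not the authors' method.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuse_label_distribution(query, features, distributions, k=3, eps=1e-8):
    """Fuse the label distributions of the k nearest low-uncertainty images.

    query:         feature vector of the high-uncertainty image
    features:      feature vectors of the low-uncertainty images
    distributions: emotion label distribution of each low-uncertainty image
    Weights are inverse distances (an assumed scheme), normalized to sum to 1.
    """
    # Pair each candidate with its distance to the query and keep the k closest.
    scored = [(euclidean(query, f), d) for f, d in zip(features, distributions)]
    nearest = sorted(scored, key=lambda pair: pair[0])[:k]

    weights = [1.0 / (dist + eps) for dist, _ in nearest]
    total = sum(weights)

    n_classes = len(nearest[0][1])
    fused = [0.0] * n_classes
    for weight, (_, label_dist) in zip(weights, nearest):
        for c in range(n_classes):
            fused[c] += (weight / total) * label_dist[c]
    return fused
```

Since the weights are normalized, the fused vector remains a valid probability distribution whenever each neighbor's distribution sums to 1, and closer neighbors contribute more.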
Abstract:
Dialogue state tracking (DST) is a core component of task-oriented dialogue systems. Recent works focus mainly on end-to-end DST models that omit the spoken language understanding (SLU) module to directly obtain the dialogue state based on a user’s dialogue. However, the slot information detected by slot filling in SLU is closely tied to the slot–value pair that needs to be updated in DST. Efficient use of the key slot semantic knowledge obtained by slot filling contributes to improving the performance of DST. Based on this idea, we introduce slot filling as a subtask and build an end-to-end joint model to explicitly integrate the slot information detected by slot filling, which further guides DST. In this article, a novel stack-propagation framework with slot filling for multidomain DST is proposed. The stack-propagation framework is introduced to jointly model slot filling and DST. The framework directly feeds the key slot semantic knowledge detected by slot filling into the DST module. In addition, a slot-masked attention mechanism is designed to enable DST to focus on the key slot information obtained by slot filling. When the slot value is updated, a slot–value softcopy mechanism is designed to enhance the influence of the words marked by key slots. Experiments show that our approach outperforms previous methods and performs outstandingly on two benchmark datasets.
Journal:
IEEE Transactions on Multimedia, 2022, 24: 2449-2460, ISSN: 1520-9210
Corresponding author:
Fang, S.
Author affiliations:
[Li, Duantengchuan; Zhang, Zhaoli; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Fang, Shuai] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China.;[Lin, Ke] Harbin Inst Technol, Control Sci & Engn, Shenzhen 150001, Peoples R China.;[Wang, Jiazhang] Northwestern Univ, Evanston, IL 60208 USA.
Corresponding institution:
[Fang, S.] C;Central China Normal University, National Engineering Laboratory For Educational Big Data, Wuhan, China
Author affiliations:
[Yang, Bing; Liu, Tingting] Hubei Univ, Sch Educ, 368 Youyi Rd, Wuhan 430062, Hubei, Peoples R China.;[Subramanian, Sriram; Liu, Tingting; Zhang, Zhaoli; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Ju, Jianping] Hubei Business Coll, Sch Artificial Intelligence, Wuhan 430079, Peoples R China.;[Tang, Jianyin] Changchun Univ Sci & Technol, Sch Electromech Engn, Changchun 130022, Peoples R China.;[Liu, Hai] UCL, UCL Interact Ctr, London, England.
Corresponding institution:
[Ju, JP ] H;[Liu, H ] C;Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;Hubei Business Coll, Sch Artificial Intelligence, Wuhan 430070, Peoples R China.
Author affiliations:
[Rong, Wenting; He, Zili; Zhao, Liang; Yang, Qiaolai; Zhu, Xiaoliang] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China.;[Rong, Wenting; He, Zili; Dai, Zhicheng] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.
Corresponding institution:
[Liang Zhao; Zhicheng Dai] N;National Engineering Research Center for E-Learning, Central China Normal University, WuHan 430079, PR China;National Engineering Research Center for Educational Big Data, Central China Normal University, WuHan 430079, PR China
Keywords:
Head pose estimation;Standard luminance;Center offset loss;Border adjustment;Feature fusion
Abstract:
Head pose estimation (HPE) is widely used in attention detection, behavior analysis, and expression recognition. Nevertheless, in some complex scenes (such as facial occlusion, large head deflection angles, and multiple persons in one scene), HPE still suffers from low estimation accuracy. To solve this problem, we propose a dual position feature fusion method for estimating head pose. First, the RGB input is replaced with a standard luminance, which reduces the effect of extraneous light factors. Subsequently, the center offset loss is used to detect the head and body positions, and a dynamic adjustment strategy is used to deflate the border, aiming not only to obtain the best confidence level but also to improve the capability of multi-person HPE. Finally, the estimation results under the head position and the body position are fused to further reduce the estimation loss. We tested our approach on the popular public AFLW2000, BIWI, and UPNA datasets. The results show the superiority of our approach in solving the occlusion, deflection, and multi-person scene problems.
Abstract:
Electromagnetic source imaging (ESI) requires solving a highly ill-posed inverse problem. To seek a unique solution, traditional ESI methods impose various forms of priors that may not accurately reflect the actual source properties, which may hinder their broad applications. To overcome this limitation, in this article, a novel data-synthesized spatiotemporally convolutional encoder-decoder network (DST-CedNet) method is proposed for ESI. The DST-CedNet recasts ESI as a machine learning problem, where discriminative learning and latent-space representations are integrated in a CedNet to learn a robust mapping from the measured electroencephalography/magnetoencephalography (E/MEG) signals to the brain activity. In particular, by incorporating prior knowledge regarding dynamical brain activities, a novel data synthesis strategy is devised to generate large-scale samples for effectively training CedNet. This stands in contrast to traditional ESI methods where the prior information is often enforced via constraints primarily aimed for mathematical convenience. Extensive numerical experiments as well as analysis of a real MEG and epilepsy EEG dataset demonstrate that the DST-CedNet outperforms several state-of-the-art ESI methods in robustly estimating source signals under a variety of source configurations.
Abstract:
Knowledge graphs are multi-relational data that contain massive entities and relations. As an effective graph representation technique based on deep learning, graph neural network has reported outstanding performance for modeling knowledge graphs in recent studies. However, previous graph neural network-based models have not fully considered the heterogeneity of knowledge graphs. Furthermore, the attention mechanism has demonstrated its great potential in many areas. In this paper, a novel heterogeneous graph neural network framework based on a hierarchical attention mechanism is proposed, including entity-level, relation-level, and self-level attentions. Thus, the proposed model can selectively aggregate informative features and weight them adequately. Then the learned embeddings of entities and relations can be utilized for the downstream tasks. Extensive experimental results on various heterogeneous graph tasks demonstrate the superior performance of the proposed model compared to several state-of-the-art methods. (C) 2022 Elsevier B.V. All rights reserved.
Authors:
Li, Zhifei;Liu, Hai*;Zhang, Zhaoli;Liu, Tingting;Xiong, Neal N.
Journal:
IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(8): 3961-3973, ISSN: 2162-237X
Corresponding author:
Liu, Hai
Author affiliations:
[Zhang, Zhaoli; Li, Zhifei; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Liu, Tingting] Hubei Univ, Sch Educ, Wuhan 430062, Peoples R China.;[Xiong, Neal N.] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China.;[Xiong, Neal N.] Northeastern State Univ, Dept Math & Comp Sci, Tahlequah, OK 74464 USA.
Corresponding institution:
[Liu, Hai] C;Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.
Abstract:
Knowledge graph (KG) embedding aims to learn embedding representations that retain the inherent structure of KGs. Graph neural networks (GNNs), as an effective graph representation technique, have shown impressive performance in learning graph embeddings. However, KGs have an intrinsic property of heterogeneity: they contain various types of entities and relations. How to address complex graph data and aggregate multiple types of semantic information simultaneously is a critical issue. In this article, a novel heterogeneous GNN framework based on an attention mechanism is proposed. Specifically, the neighbor features of an entity are first aggregated under each relation-path. Then the importance of different relation-paths is learned through the relation features. Finally, the relation-path-based features are aggregated with the learned weight values to generate the embedding representation. Thus, the proposed method not only aggregates entity features from different semantic aspects but also allocates appropriate weights to them. This method can capture various types of semantic information and selectively aggregate informative features. The experiment results on three real-world KGs demonstrate superior performance when compared with several state-of-the-art methods.
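The three steps described in this abstract (aggregate neighbors per relation-path, weight the relation-paths, combine) can be illustrated with a toy sketch. The abstract does not give the actual aggregation or scoring functions, so the mean pooling, the dot-product score against a relation feature vector, the softmax normalization, and all names below are assumptions for illustration rather than the paper's formulation.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def aggregate_entity(neighbours_by_path, relation_features):
    """Attention-weighted aggregation over relation-paths (illustrative only).

    neighbours_by_path: {path_name: [neighbor feature vectors]}
    relation_features:  {path_name: relation feature vector}
    Returns the entity embedding and the attention weight of each path.
    """
    paths = sorted(neighbours_by_path)
    # Step 1: aggregate neighbor features under each relation-path (mean pooling).
    path_feats = [mean_vector(neighbours_by_path[p]) for p in paths]
    # Step 2: score each path with its relation features (assumed dot product).
    scores = [
        sum(a * b for a, b in zip(feat, relation_features[p]))
        for p, feat in zip(paths, path_feats)
    ]
    weights = softmax(scores)
    # Step 3: weighted sum of the path-level features.
    dim = len(path_feats[0])
    embedding = [
        sum(w * feat[i] for w, feat in zip(weights, path_feats))
        for i in range(dim)
    ]
    return embedding, dict(zip(paths, weights))
```

In a trained model the scoring function would have learnable parameters; here the relation feature vectors simply stand in for them.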
Authors:
Zhang, Zhaoli;Li, Zhifei;Liu, Hai;Xiong, Neal N.
Journal:
IEEE Transactions on Knowledge and Data Engineering, 2022, 34(5): 2335-2347, ISSN: 1041-4347
Corresponding author:
Li, ZF
Author affiliations:
[Li, Zhifei; Zhang, Zhaoli; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Hubei, Peoples R China.;[Xiong, Neal N.] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan 430079, Hubei, Peoples R China.
Corresponding institution:
[Li, ZF ] C;Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Hubei, Peoples R China.
Author affiliations:
[Peng, Shixin; Chen, Jingying] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China.;[Chen, Xiaohui] State Grid Hunan Elect Power Co Ltd, Informat & Commun Branch, Changsha 410004, Peoples R China.;[Lu, Wei] Air Force Early Warning Acad, Wuhan 430019, Peoples R China.;[Deng, Chao] Huazhong Univ Sci & Technol, Sch Elect Informat & Commun, Wuhan 430074, Peoples R China.
Corresponding institution:
[Chen, JY ] C;Cent China Normal Univ, Natl Engn Res Ctr E Learning, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China.
Keywords:
Interference alignment;Limited feedback;MIMO;Precoding matrix;Smart grid;Spatial multiplexing gain
Journal:
IEEE Robotics and Automation Letters, 2022, 7(2): 1976-1983, ISSN: 2377-3766
Corresponding author:
Li, YF
Author affiliations:
[Li, Youfu; Xie, Bochen; Deng, Yongjian] City Univ Hong Kong, Dept Mech Engn, Hong Kong, Peoples R China.;[Shao, Zhanpeng] Zhejiang Univ Technol, Coll Comp Sci & Technol, Hangzhou 310023, Peoples R China.;[Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.
Corresponding institution:
[Li, YF ] C;City Univ Hong Kong, Dept Mech Engn, Hong Kong, Peoples R China.