Author affiliations:
[Zhang, Zhaoli; Nie, Hanwen; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Liu, Hai] UCL, UCL Interact Ctr, London, England.;[Li, You-Fu; Liu, Hai] City Univ Hong Kong, Dept Mech Engn, Kowloon, Hong Kong, Peoples R China.
Corresponding institution:
[Zhang, Zhaoli] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.
Abstract:
Head pose estimation is an important way to understand human attention in human-computer interaction. In this paper, we propose a novel anisotropic angle distribution learning (AADL) network for the head pose estimation task. First, two key findings are reported: 1) for the same angle increment from a fixed central pose, head pose image variations differ between the yaw and pitch directions; 2) as the angle interval increases, the image variations in the yaw direction first increase and then decrease. Then, the maximum a posteriori technique is employed to construct the head pose estimation network, which comprises three parts: a convolutional layer, a covariance pooling layer, and an output layer. In the output layer, the labels are constructed as anisotropic angle distributions on the basis of the two key findings, and these distributions are fitted by 2D Gaussian-like distributions (ground-truth labels). Furthermore, the Kullback-Leibler divergence is used to measure the discrepancy between the predicted label and the ground-truth one. The features of head pose images are learned by the AADL-based convolutional neural network in an end-to-end manner. Experimental results demonstrate that the developed AADL-based labels have several advantages, such as robustness to missing head pose images and insensitivity to motion blur. Moreover, the proposed method achieves good performance compared with several state-of-the-art methods on the Pointing'04 and CAS_PEAL_R1 databases. (c) 2020 Elsevier B.V. All rights reserved.
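The labeling idea in this abstract (a 2D Gaussian-like distribution with different spreads along yaw and pitch, compared with a prediction via the Kullback-Leibler divergence) can be illustrated with a minimal sketch. The angle grids, the center pose, and the two standard deviations below are hypothetical values chosen for illustration; the abstract does not give the paper's actual parameters or network.

```python
import math

def anisotropic_label(yaw_grid, pitch_grid, yaw0, pitch0, sigma_yaw, sigma_pitch):
    """Discrete 2D Gaussian-like label over a (pitch, yaw) grid, normalized to sum to 1.

    Using different sigma values for yaw and pitch makes the label anisotropic.
    """
    weights = [
        [
            math.exp(-0.5 * (((y - yaw0) / sigma_yaw) ** 2
                             + ((p - pitch0) / sigma_pitch) ** 2))
            for y in yaw_grid
        ]
        for p in pitch_grid
    ]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two distributions defined on the same grid."""
    return sum(
        t * math.log((t + eps) / (q + eps))
        for t_row, q_row in zip(target, predicted)
        for t, q in zip(t_row, q_row)
        if t > 0
    )

# Hypothetical angle grids in degrees (assumed, not taken from the paper).
yaw_grid = list(range(-90, 91, 15))
pitch_grid = list(range(-60, 61, 15))

# Hypothetical spreads: wider along yaw than along pitch.
label = anisotropic_label(yaw_grid, pitch_grid, yaw0=0, pitch0=0,
                          sigma_yaw=20.0, sigma_pitch=10.0)

# A uniform "prediction" to compare against the label.
cells = len(yaw_grid) * len(pitch_grid)
uniform = [[1.0 / cells for _ in yaw_grid] for _ in pitch_grid]

print("KL(label || label)   =", kl_divergence(label, label))
print("KL(label || uniform) =", kl_divergence(label, uniform))
```

In a training setup, the second argument of `kl_divergence` would be the network's softmax output over the same grid; here a uniform distribution stands in for it.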
Abstract:
Knowledge graph embedding aims to learn embedded representations of entities and relations in knowledge graphs, which is very important for the subsequent link prediction task. However, two key issues exist in learning knowledge graph embeddings: 1) How can deep learning algorithms be fully exploited to generate expressive embeddings? 2) How can the polysemy phenomenon in multi-relational knowledge graphs be handled, where entities and relations show different semantics when involved in different predictions? In this article, to tackle the first problem, multi-layer convolutional networks are adopted to generate features of entities and relations, which are then used to predict the candidate entity. Moreover, the representation power of the networks is strengthened by integrating an effective recalibration mechanism that selectively accentuates informative features. To tackle the second problem, we propose to learn multiple specific interaction embeddings. Instead of directly learning one general embedding to preserve all information for each entity and relation, their interactions are captured to model the cross-semantic influence from relations to entities and from entities to relations. Compared with traditional embedding models, the proposed model provides stronger generalization and effectively captures potential links between entities and relations. Experimental results show that the proposed model achieves state-of-the-art performance on general evaluation metrics for link prediction tasks. (c) 2020 Elsevier B.V. All rights reserved.
Author affiliations:
[Zhang, Zhaoli; Nie, Hanwen; Shu, Jiangbo; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Xiong, Naixue] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China.
Corresponding institution:
[Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.
Author affiliations:
[Lai, Chenghang; Zhang, Zhaoli; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Li, You-Fu; Liu, Hai] City Univ Hong Kong, Dept Mech Engn, Kowloon, Hong Kong, Peoples R China.
Corresponding institution:
[Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;City Univ Hong Kong, Dept Mech Engn, Kowloon, Hong Kong, Peoples R China.
Abstract:
Facial expression recognition, a crucial step for emotion recognition, remains an open challenge owing to the correlation and ambiguity among individual expressions. In this paper, to tackle these challenges, a novel model with correlation emotion label distribution learning is proposed for near-infrared (NIR) facial expression recognition, which associates multiple emotions with each expression depending on the similarity of expressions. First, the similarities of the seven basic expressions are calculated and then used to guide the correlation emotion label distribution by predicting the latent label probability distribution of the expression. Furthermore, the proposed model can be learned in an end-to-end manner via a constructed convolutional neural network to classify the six basic facial expressions. Experimental results on the Oulu_CASIA database demonstrate that the proposed method achieves superior performance on NIR expression recognition. (C) 2020 Elsevier B.V. All rights reserved.
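To make the idea of associating several emotions with one expression concrete, the sketch below turns a vector of expression similarities into a soft label distribution. The abstract does not specify how the paper builds its distributions, so the temperature-scaled softmax, the emotion list, and the similarity values are all assumptions made purely for illustration.

```python
import math

# Illustrative emotion classes (assumed ordering, not taken from the paper).
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]

def label_distribution(similarities, temperature=0.2):
    """Map similarity scores to a probability distribution with a softmax.

    A lower temperature concentrates mass on the most similar emotion;
    a higher one spreads it across correlated emotions.
    """
    peak = max(similarities)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp((s - peak) / temperature) for s in similarities]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical similarities between a "fear" sample and each emotion class.
fear_similarity = [0.3, 0.2, 1.0, 0.1, 0.4, 0.7]
fear_label = label_distribution(fear_similarity)

for name, prob in zip(EMOTIONS, fear_label):
    print(f"{name:>10s}: {prob:.3f}")
```

A distribution of this kind could replace a one-hot target when training a classifier, so that visually similar expressions (here, fear and surprise) share probability mass.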
Author affiliations:
[Wang, Xiang; Zhang, Zhaoli; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Wang, Xiang; Zhang, Wei] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China.;[Li, You-Fu; Liu, Hai] City Univ Hong Kong, Dept Mech Engn, Hong Kong, Peoples R China.
Corresponding institution:
[Zhang, Wei] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China.
Abstract:
Head pose estimation (HPE) has been widely applied in human attention recognition, robot vision, and assisted driving. Infrared (IR) images bear the unique advantages of resistance to illumination changes and strong penetration while remaining effective under visible scenarios. However, the lack of a public IR database hinders research progress in low-illumination environments. In this paper, we establish a first-of-its-kind infrared head pose (IRHP) database and propose a novel convolutional neural network architecture, IRHP-Net, on the IRHP database. The IRHP database contains 145 kinds of IR head pose images of subjects, and benchmark evaluations are conducted on our database with standard facial-feature-based HPE classification methods to demonstrate the usability and effectiveness of the IRHP database. To extract adaptive features for the IR images, a novel multi-scale feature fusion descriptor is developed in the proposed IRHP-Net model. Quantitative assessments of the proposed method on the IRHP images demonstrate significant improvements over traditional methods. The newly proposed IRHP-Net model can be utilized in human attention recognition and intelligent driving assistance systems. (c) 2020 Elsevier B.V. All rights reserved.
Journal:
IEEE-ASME Transactions on Mechatronics, 2019, 24(1):384-394 ISSN:1083-4435
Corresponding author:
Liu, Hai
Author affiliations:
[Liu, Sannyuya; Liu, Tingting; Zhang, Zhaoli; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Hubei, Peoples R China.;[Liu, Tingting] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA.;[Li, Youfu; Liu, Hai] City Univ Hong Kong, Dept Mech Engn, Hong Kong, Peoples R China.
Corresponding institution:
[Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Hubei, Peoples R China.
Keywords:
FTIR imaging spectrometers;instrumentation;mechatronics industry;optical data processing;robot vision;wavelet transforms
Abstract:
Fourier transform infrared (FTIR) imaging spectrometers often suffer from band overlap and random noise during the infrared spectrum acquisition process. Such noise degrades the quality of the acquired infrared spectrum, limiting the precision of the subsequent processing. In this paper, we present a novel blind reconstruction method with wavelet transform regularization for infrared spectra obtained from aging instruments. Inspired by the finding that the wavelet coefficient distribution of the clean spectrum is sparser than that of the degraded spectrum, a blind reconstruction model for the infrared spectrum is proposed to regularize the distribution of the degraded spectrum by total variation regularization. This method performs well in suppressing random noise and preserving spectral structure details. In addition, an effective optimization scheme is introduced to solve the formulated optimization problem. The instrument response function and latent spectrum can be simultaneously estimated by the proposed method, which efficiently mitigates the effects of instrument degradation. Finally, extensive experiments on simulated and real noisy infrared spectra are carried out to demonstrate the superiority of the proposed method over existing state-of-the-art ones. Thus, the reconstructed spectrum will better serve feature extraction and educational robot infrared vision sensing in industrial applications.
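The observation that a clean spectrum has sparser wavelet coefficients than a degraded one can be checked on a toy example. The sketch below assumes a one-level Haar transform, a synthetic single-peak spectrum, and additive Gaussian noise; none of these choices come from the abstract, which names neither the wavelet nor the noise model, and band overlap is not simulated here.

```python
import math
import random

def haar_detail(signal):
    """One-level Haar detail coefficients of an even-length signal."""
    return [
        (signal[i] - signal[i + 1]) / math.sqrt(2.0)
        for i in range(0, len(signal) - 1, 2)
    ]

def l1_norm(values):
    """Sum of absolute values, used here as a simple sparsity proxy."""
    return sum(abs(v) for v in values)

random.seed(0)  # fixed seed so the toy experiment is repeatable

n = 128
# Synthetic "clean" spectrum: one smooth Gaussian-shaped band (assumed shape).
clean = [math.exp(-0.5 * ((i - 64) / 10.0) ** 2) for i in range(n)]
# Degraded spectrum: the clean one plus hypothetical Gaussian noise.
noisy = [c + random.gauss(0.0, 0.1) for c in clean]

clean_l1 = l1_norm(haar_detail(clean))
noisy_l1 = l1_norm(haar_detail(noisy))

print("L1 of detail coefficients, clean:", clean_l1)
print("L1 of detail coefficients, noisy:", noisy_l1)
```

A smaller L1 norm of the detail coefficients indicates a sparser representation, which is the property a sparsity-promoting regularizer would reward during reconstruction.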
Journal:
Journal of Educational Computing Research, 2019, 58(1):63-86 ISSN:0735-6331
Corresponding author:
Liu, Hai
Author affiliations:
[Liu, Sannyuya; Li, Zhenhua; Cao, Taihe; Zhang, Zhaoli; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.;[Li, Zhenhua] China West Normal Univ, Network & Informat Management Ctr, Nanchong, Peoples R China.;[Liu, Hai] Natl Engn Res Ctr E Learning, 152 Luoyu Rd, Wuhan 430079, Hubei, Peoples R China.
Corresponding institution:
[Liu, Hai] Natl Engn Res Ctr E Learning, 152 Luoyu Rd, Wuhan 430079, Hubei, Peoples R China.
Abstract:
Online learning engagement detection is a fundamental problem in educational information technology. Efficient detection of students’ learning situations can provide information to teachers to help them identify students having trouble in real time. To improve the accuracy of learning engagement detection, we collected two aspects of students’ behavior data: face data (using adaptive weighted Local Gray Code Patterns for facial expression recognition) and mouse interaction. In this article, we propose a novel learning engagement detection algorithm based on the collected data (students’ behavior), which come from the cameras and the mouse in the online learning environment. The cameras were used to capture students’ face images, while the mouse movement data were captured simultaneously. In the process of image data labeling, we built two datasets for classifier training and testing. One took the mouse movement data as a reference, while the other did not. We performed experiments on the two datasets using several methods and found that the classifier trained on the former dataset performed better, with a higher recognition rate than the one trained on the latter (94.60% vs. 91.51%).
Author affiliations:
[Liu, Tingting; Zhang, Zhaoli; Liu, Sanya; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Hubei, Peoples R China.;[Liu, Tingting; Liu, Hai] City Univ Hong Kong, Dept Mech Engn, 83 Tat Chee Ave, Kowloon, Hong Kong, Peoples R China.;[Liu, Tingting] Univ Pittsburgh, Sch Educ, Pittsburgh, PA 15260 USA.
Corresponding institution:
[Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Hubei, Peoples R China.;City Univ Hong Kong, Dept Mech Engn, 83 Tat Chee Ave, Kowloon, Hong Kong, Peoples R China.
Author affiliations:
[Liu, TingTing; Zhang, Zhaoli; Liu, Sanya; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Hubei, Peoples R China.;[Li, Youfu; Liu, Hai] City Univ Hong Kong, Dept Mech Engn, 83 Tat Chee Ave, Hong Kong, Hong Kong, Peoples R China.
Corresponding institution:
[Li, Youfu] City Univ Hong Kong, Dept Mech Engn, 83 Tat Chee Ave, Hong Kong, Hong Kong, Peoples R China.
Abstract:
FTIR spectrometers often suffer from the common problems of band overlap and Poisson noise. In this paper, we show that infrared (IR) spectrum degradation can be formulated as a maximum a posteriori (MAP) problem and solved by minimizing a cost function that includes a likelihood term and two prior terms. In the MAP framework, the likelihood probability density function (PDF) is constructed based on the observed Poisson noise model. A fitted distribution of curvelet transform coefficients is used as the spectral prior PDF, and the instrument response function (IRF) prior is described by a Gauss-Markov function. Moreover, the split Bregman iteration method is employed to solve the resulting minimization problem, which greatly reduces the computational load. As a result, the Poisson noise is perfectly removed, while the spectral structure information is well preserved. The novelty of the proposed method lies in its ability to estimate the IRF and latent spectrum in a joint framework, thus eliminating the degradation effects to a large extent. The reconstructed IR spectrum is more convenient for extracting spectral features and interpreting unknown chemical or biological materials. (C) 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement
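The likelihood term of such a MAP cost function can be sketched for a Poisson noise model. The code below evaluates only the data term, the Poisson negative log-likelihood up to a constant, for a spectrum blurred by an instrument response function. The Gaussian IRF shape, the edge handling, and the toy spectrum are assumptions; the two prior terms and the split Bregman solver described in the abstract are not implemented.

```python
import math

def gaussian_irf(half_width, sigma):
    """Normalized Gaussian kernel standing in for the IRF (assumed shape)."""
    taps = [math.exp(-0.5 * (k / sigma) ** 2)
            for k in range(-half_width, half_width + 1)]
    total = sum(taps)
    return [t / total for t in taps]

def blur(spectrum, irf):
    """Convolve the spectrum with a symmetric IRF, replicating edge samples."""
    half = len(irf) // 2
    n = len(spectrum)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(irf):
            j = min(max(i + k - half, 0), n - 1)  # clamp index at the borders
            acc += w * spectrum[j]
        out.append(acc)
    return out

def poisson_data_term(observed, spectrum, irf, eps=1e-12):
    """Poisson negative log-likelihood (constant dropped): sum(m - y * log m)."""
    model = blur(spectrum, irf)
    return sum(m - y * math.log(m + eps) for y, m in zip(observed, model))

# Toy latent spectrum: two bands on a constant baseline (hypothetical values).
n = 64
latent = [5.0 + 40.0 * math.exp(-0.5 * ((i - 20) / 2.0) ** 2)
          + 25.0 * math.exp(-0.5 * ((i - 42) / 3.0) ** 2) for i in range(n)]
irf = gaussian_irf(half_width=6, sigma=2.0)

# Use the expected counts as the observation to keep the demo deterministic.
observed = blur(latent, irf)

flat = [sum(observed) / n] * n  # a poor candidate: constant spectrum
cost_true = poisson_data_term(observed, latent, irf)
cost_flat = poisson_data_term(observed, flat, irf)
print("data term at the latent spectrum:", cost_true)
print("data term at a flat spectrum:   ", cost_flat)
```

Because each summand `m - y * log m` is minimized at `m = y`, the latent spectrum attains a lower data term than the flat candidate; in a full MAP scheme the prior terms would be added to this cost before minimization.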
Author affiliations:
[Zhang, Zhaoli; Li, Yang; Liu, Hai; Shu, Jiangbo] National Engineering Research Center for E-Learning, Science Hall, Central China Normal University, 152 Luoyu Road, Wuhan, Hubei, China
Conference name:
2018 International Conference on Distance Education and Learning, ICDEL 2018