Author affiliations:
[Ma, Yingjun] Xiamen Univ Technol, Sch Appl Math, Xiamen, Peoples R China.;[Ma, Yuanyuan] Anyang Normal Univ, Sch Comp & Informat Engn, Anyang, Peoples R China.;[Jiang, Xingpeng] Cent China Normal Univ, Sch Comp, Wuhan, Peoples R China.;[Jiang, Xingpeng] Cent China Normal Univ, Hubei Key Lab Artificial Intelligence & Smart Lear, Wuhan, Peoples R China.
Corresponding institution:
[Xingpeng Jiang] School of Computer, Central China Normal University, Wuhan, China; Hubei Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, China
Abstract:
Microbial communities are an important part of organisms and ecosystems and help maintain their health and stability. Analyzing the interactions of microorganisms in an ecosystem and mining the co-occurrence modules of the microbial community can deepen the understanding of microbial community function. This could also improve the ability to manipulate the microbial community, thus providing new means for ecological restoration, disease treatment and drug development. Beyond investigations of pairwise relationships, more and more studies have recognized that higher-order interactions may play important roles in explaining the diversity and complexity of the community. In this study, a hypergraph clustering method based on modularity feature projection (HCMFP) is proposed to detect microbial communities in the higher-order interaction network among microbes. Specifically, HCMFP uses information entropy to mine the higher-order logical relationships among microbes, and constructs a hypergraph learning model based on modularity feature projection to detect the microbial community. The experimental results show that, compared with other methods, HCMFP has better clustering performance and reliable convergence speed. The proposed method is an effective tool for detecting high-order organization in microbial interaction networks. The code and data in this study are freely available at https://github.com/Mayingjun20179/HCMFP.
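The abstract above mentions using information entropy to mine higher-order relationships among microbes. The following is only a minimal sketch of that general idea, not the HCMFP procedure itself: it assumes binarized presence/absence data and uses conditional entropy to show that a third microbe can be fully explained by two others jointly while remaining uncertain given either one alone. The sample vectors and function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete observations."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def conditional_entropy(target, *givens):
    """H(target | givens), computed as H(target, givens) - H(givens)."""
    joint = list(zip(target, *givens))
    cond = list(zip(*givens))
    return entropy(joint) - entropy(cond)


# Hypothetical presence (1) / absence (0) of three microbes across 8 samples.
a = [0, 0, 1, 1, 0, 0, 1, 1]
b = [0, 1, 0, 1, 0, 1, 0, 1]
c = [x & y for x, y in zip(a, b)]  # c appears only when both a and b do

h_c = entropy(c)                              # uncertainty about c alone
h_c_given_a = conditional_entropy(c, a)       # uncertainty left after seeing a
h_c_given_ab = conditional_entropy(c, a, b)   # uncertainty left after seeing a and b
print(h_c, h_c_given_a, h_c_given_ab)
```

In this toy data the uncertainty about c drops to zero only when a and b are considered together, which is the kind of three-way (higher-order) dependency that a pairwise analysis would understate.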
Abstract:
Online learning has developed rapidly, but its success rate is very low. Hence, it is of great significance to construct a model that predicts learning results and quickly and accurately identifies students at risk of failing their course. To mine the dynamic features of learning behaviors and use them to improve the accuracy of detecting at-risk students, we propose a long short-term memory (LSTM) network based approach to identify at-risk students. To validate the performance of this approach, we first extracted the behavior data of one course from a public dataset and generated two types of datasets: aggregated datasets and sequential datasets. We then used eight classic machine learning methods to train prediction models on these datasets and explored whether the models trained on sequential datasets are more accurate than those trained on aggregated datasets. The results show that the models trained on sequential datasets are more accurate when naïve Bayes, Classification and Regression Tree, Random Forest (RF), Iterative Dichotomiser 3 and Multilayer Perceptron are used. Finally, we used the LSTM to train prediction models on sequential datasets and compared them with the best models trained by RF. The results show that the models trained by the LSTM are more accurate, which demonstrates the effectiveness of the proposed approach to a certain extent.
Abstract:
Oracle bone inscriptions (OBIs) are ancient Chinese scripts that originated in the Shang Dynasty of China, and fewer than half of the existing OBIs have been well deciphered. To date, interpreting OBIs has relied mainly on professional historians applying the rules of OBI evolution, and deciphering of the remaining inscriptions has reached a bottleneck. Here, we systematically analyze the evolution of oracle characters by using a Siamese network in few-shot learning (FSL). We first establish a dataset of Chinese characters that have undergone a relatively complete evolution, including images from five periods: oracle bone inscriptions, bronze inscriptions, seal inscriptions, official script, and regular script. Then, we compare the performance of three typical algorithms, VGG16, ResNet, and AlexNet, as the backbone feature extraction network of the Siamese network. The results show that the combination of VGG16 and the Siamese network obtains the highest F1 value of 83.3% and the highest recognition accuracy of 82.67%. Based on this analysis, we evaluate the typical structural performance of each period and find that the optimized Siamese network is a feasible means of studying the evolution of OBIs. Our findings provide a new approach for further deciphering of OBIs.
Author affiliations:
[Han, Xiaofeng; Zhang, Jiahua; Cheng, Can; Liang, Peng; Li, Bing; Li, B] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China.;[Li, Zengyang] Cent China Normal Univ, Sch Comp Sci, Wuhan, Peoples R China.
Corresponding institution:
[Liang, P; Li, B] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China.
Keywords:
Open source software project;GitHub;Public development project
Abstract:
With the tools and datasets available in the GitHub ecosystem, researchers have the opportunity to study diverse software engineering problems on large-scale datasets. However, there are many potential threats when researchers use large-scale datasets directly, and one important threat is that GitHub contains many private projects (e.g., homework) and non-development projects (e.g., blogs). For researchers who want to study the cooperative behavior of developers or the development process of projects, research samples should not contain private projects or non-development projects. To solve this problem, we first analyzed the weaknesses of the baseline methods (i.e., selecting top projects) and extended ML-based methods (i.e., training models on a labeled training dataset using ML algorithms, Extended_MLMs for short), and then proposed two methods, Enhanced_RFM and Fusion_DL_RFM, to address the weaknesses of Extended_RFM (the Extended_MLM that is based on Random Forest and has the best performance among all the Extended_MLMs). The results show that: (1) existing project sample selection methods have a low F-measure and poor generality (i.e., they perform badly on the testing dataset); (2) Enhanced_RFM outperforms Fusion_DL_RFM in accuracy and stability; and (3) by adopting Enhanced_RFM, the F-measure of Extended_RFM is improved from 0.690 to 0.810 and the precision of Extended_RFM is improved from 0.559 to 0.785 under cross validation, which indicates that the generality of Extended_RFM is significantly improved.
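The abstract above reports F-measure and precision but not recall. As a worked illustration of how the three quantities relate, the sketch below assumes that the reported F-measure is the balanced F1 score (the harmonic mean of precision and recall) and that each F-measure/precision pair comes from the same evaluation; neither assumption is confirmed by the abstract, and the helper names are hypothetical.

```python
def f1(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def implied_recall(precision, f_measure):
    """Solve F = 2PR / (P + R) for R; only valid when 2P > F."""
    return f_measure * precision / (2 * precision - f_measure)


# Figures quoted in the abstract (assumed to be F1 and precision of one run each).
recall_before = implied_recall(0.559, 0.690)
recall_after = implied_recall(0.785, 0.810)

# Round-trip check: plugging the implied recall back in recovers the F-measure.
print(recall_before, f1(0.559, recall_before))
print(recall_after, f1(0.785, recall_after))
```

Under these assumptions the implied recall is roughly 0.90 before and 0.84 after, which would suggest that the F-measure gain comes mainly from higher precision. Because the inputs are rounded and the F-measure variant is assumed, this should be read as an arithmetic illustration rather than a result of the paper.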
Author affiliations:
[Jin, Cong] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.
Corresponding institution:
[Cong Jin] School of Computer, Central China Normal University, Wuhan, China
Keywords:
MOOC;Dropout prediction;Initial weight calculation;Intelligent optimization;Clickstream data
Abstract:
Currently, the high dropout rate of massive open online courses (MOOCs) has seriously affected their popularity and promotion. How to effectively predict the dropout status of MOOC students so as to intervene as early as possible has become a hot topic. Different MOOC students differ greatly in learning behaviors, learning habits, learning time, and so on. As a result, different student samples have different effects on the prediction performance of a machine learning-based dropout prediction model (DPM), because the performance of machine learning-based classifiers depends heavily on the quality of the training samples. To solve this problem, this paper proposes a new DPM based on machine learning. Since the traditional neighborhood concept has nothing to do with the label of a sample, a new neighborhood definition, the max neighborhood, is first given. It is related not only to the distance between samples but also to the labels of the samples. Then, an algorithm for calculating the initial weight of each student sample is studied based on the definition of the max neighborhood, in contrast to the common practice of randomly selecting initial values. Next, the optimization of the initial weights of the student samples is further studied using an intelligent optimization method. Finally, the classifiers trained on the weighted training samples are used as the DPM. Experimental results from direct observation and statistical testing on public data sets indicate that training sample weighting and intelligent optimization can significantly improve the predictive performance of the DPM.
Abstract:
The top-k augmented spatial keyword query (TkASKQ) retrieves the k objects with the highest scores under a scoring function that considers spatial proximity, textual similarity and attribute matching simultaneously. As far as we know, no work has been conducted on answering why-not questions on TkASKQ queries (WTkASKQ). This paper takes the first step toward addressing WTkASKQ queries by adopting a query refinement model. Specifically, we propose a hybrid indexing structure, A(k)C, which adopts a two-level partitioning scheme to efficiently organize the textual, attribute, and spatial information of objects. Based on A(k)C, several filtering strategies are proposed to prune unqualified objects during query processing. To limit the number of refined queries to be explored, we construct new refined queries by sequentially extracting new keywords and attribute-value pairs from missing objects and adding them to the original keyword and attribute-value sets, respectively, so as to efficiently obtain the best refined query with minimal modification cost. In addition, we discuss the applicability of the methods to why-not questions on augmented regional queries, ordinary top-k SKQ queries and complex scoring queries. Experimental results show that our A(k)C-based method has higher query efficiency than the baseline methods. (C) 2021 Elsevier B.V. All rights reserved.
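To make the notion of a scoring function over spatial proximity, textual similarity and attribute matching concrete, here is a minimal brute-force sketch. It is not the paper's scoring function or its A(k)C index: the weighted-sum form, the weights, the Jaccard similarity, the distance normalization and all object data are assumptions chosen for illustration.

```python
import heapq
from math import hypot


def jaccard(a, b):
    """Jaccard similarity between two keyword collections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def score(obj, query, max_dist, weights=(0.4, 0.4, 0.2)):
    """Hypothetical weighted sum of spatial, textual and attribute components."""
    w_s, w_t, w_a = weights
    dist = hypot(obj["x"] - query["x"], obj["y"] - query["y"])
    spatial = max(0.0, 1.0 - dist / max_dist)      # closer objects score higher
    textual = jaccard(obj["keywords"], query["keywords"])
    wanted = query["attrs"]
    matched = sum(obj["attrs"].get(k) == v for k, v in wanted.items())
    attribute = matched / len(wanted) if wanted else 1.0
    return w_s * spatial + w_t * textual + w_a * attribute


def top_k(objects, query, k, max_dist):
    """Brute-force top-k: score every object and keep the k best."""
    return heapq.nlargest(k, objects, key=lambda o: score(o, query, max_dist))


objects = [
    {"id": "cafe", "x": 1.0, "y": 1.0, "keywords": ["coffee", "wifi"], "attrs": {"open": True}},
    {"id": "bar", "x": 0.5, "y": 0.5, "keywords": ["beer"], "attrs": {"open": True}},
    {"id": "diner", "x": 8.0, "y": 8.0, "keywords": ["coffee", "wifi"], "attrs": {"open": False}},
]
query = {"x": 0.0, "y": 0.0, "keywords": ["coffee", "wifi"], "attrs": {"open": True}}

result = [o["id"] for o in top_k(objects, query, k=2, max_dist=20.0)]
print(result)
```

With this toy data the "diner" object matches the query keywords but falls outside the top two, so a user might ask why it is missing. A why-not question of that kind is what query refinement aims to answer, by modifying the query minimally until the missing object appears in the result.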
Journal:
INTERNATIONAL JOURNAL OF COMMUNICATION SYSTEMS, 2021, 34(3):e4698.1-e4698.15 ISSN:1074-5351
Corresponding author:
Tang, Shengda
Author affiliations:
[Tang, Shengda] Guangxi Normal Univ, Coll Math & Stat, Guilin, Guangxi, Peoples R China.;[Tan, Liansheng] Cent China Normal Univ, Dept Comp Sci, Wuhan, Peoples R China.;[Liu, Tao] Southwest Minzu Univ, Sch Comp Sci & Technol, Chengdu, Peoples R China.
Corresponding institution:
[Tang, Shengda] Guangxi Normal Univ, Coll Math & Stat, Guilin, Guangxi, Peoples R China.
Keywords:
Energy harvesting wireless communication system (EH-WCS);Markov fluid flow model;Performance metrics;reliable energy backup (REB)
Abstract:
An energy harvesting wireless communication system (EH-WCS) is capable of harvesting the energy for its operation from surrounding renewable energy sources. However, the randomness and instability of the harvested energy can lead to energy depletion. To provide reliable communication services with a quality of service (QoS) guarantee, the EH-WCS needs a reliable energy backup (REB) that supplies energy to the system when its primary energy source fails. In this paper, a novel stochastic model, the extended Markov fluid flow model, is proposed to describe the EH-WCS with REB. The Kolmogorov forward equations of the system model are derived. By solving these equations, we obtain the stationary distributions of the key performance metrics for the EH-WCS with REB, including the average energy consumption rate of the EH-WCS, the residual energy distribution, the average energy supply rate of the REB, the packet queue length in the data buffer, the data queue delay, and the packet blocking probability. A numerical example is provided to illustrate the theoretical results, and the effects of the system parameters on performance are further studied numerically. Both the theoretical insights and the numerical analyses are believed to be important for the design of EH-WCSs.
Journal:
IET Communications, 2021, 15(2):328-336 ISSN:1751-8628
Corresponding author:
Hsu, Chingfang
Author affiliations:
[Zhang, Maoyuan; Hsu, Chingfang; Zhao, Zhuo] Cent China Normal Univ, Comp Sch, Wuhan 430079, Peoples R China.;[Harn, Lein] Univ Missouri, Dept Comp Sci Elect Engn, Kansas City, MO 64110 USA.;[Xia, Zhe] Wuhan Univ Technol, Dept Comp Sci, Wuhan, Peoples R China.
Corresponding institution:
[Hsu, Chingfang] Cent China Normal Univ, Comp Sch, Wuhan 430079, Peoples R China.
Keywords:
Cryptography;Protocols;Mobile radio systems;Wireless sensor networks
Abstract:
Group-oriented applications show their potential in the next generation of wireless sensor networks (5G WSNs), whose devices are heterogeneous and therefore differ in storage, computing, communication and energy capabilities. One of the main challenges for secure group-oriented applications (SGA) in 5G WSNs is how to secure communication between these heterogeneous devices. Conventional protocols are not suitable for SGA in 5G sensor networks, since multiparty output establishment in this environment requires lightweight communication and computation overhead; furthermore, the primary task of SGA in 5G WSNs is to securely transmit various types of jointly computed data. Hence, membership authentication and multiparty output for arithmetic computations become two fundamental and necessary security services in SGA for 5G WSNs. In this paper, we propose a novel design of non-interactive integrated membership authenticated multiparty output for arithmetic computations in 5G sensor networks, which embeds the functions of membership authentication and multiparty output for arithmetic computations. Since any arithmetic computation function is composed of multiple additions and multiplications, our result serves as a general method for multiparty computation output in SGA. This design is well suited to lightweight membership authenticated multiparty arithmetic computation output in 5G sensor networks.
Abstract:
Software defect prediction (SDP) is an important way to analyze software quality and reduce development costs. Data collected during the software lifecycle can be used to predict software defects. Many SDP models have been proposed; however, their performance is not always ideal. In many existing prediction models based on machine learning, the distance metric between samples has a significant impact on the performance of the SDP model. In addition, the samples are usually class imbalanced. To address these issues, this paper proposes a novel distance metric learning method based on cost-sensitive learning (CSL) to reduce the impact of class imbalance, and applies it to the large margin distribution machine (LDM) in place of the traditional kernel function. Further, the improvement and optimization of LDM based on CSL are also studied, and the improved LDM, called CS-ILDM, is used as the SDP model. Subsequently, the proposed CS-ILDM is applied to five publicly available data sets from the NASA Metrics Data Program repository and its performance is compared with that of other existing SDP models. The experimental results confirm that the proposed CS-ILDM not only has good prediction performance, but also reduces the misprediction cost and mitigates the impact of class imbalance.
Abstract:
Currently, the Android unlock pattern is the most widely used graphical unlock pattern. However, like text passwords, user-generated Android graphical passwords usually have low entropy, and most of them can be easily guessed by an attacker. In this paper, we study how changing the nine-point 3 x 3 layout of the Android unlock pattern affects the security and usability of graphical passwords. We conducted an online study on a university campus to evaluate the security and usability of the original Android unlock pattern and four other patterns. We evaluated the recall success rate, setup time and recall time for the graphical passwords. In addition, we built a 3-gram Markov model to evaluate the security of the passwords using three indicators: beta-success-rate, min-entropy, and partial guessing entropy. We found that changing the nine-point layout of the original Android unlock pattern can improve security to a certain extent without much impact on usability.
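Two of the security indicators named in the abstract above have simple standard definitions: min-entropy is the negative base-2 logarithm of the most likely password's probability, and the beta-success-rate is the total probability of the beta most likely passwords. The sketch below computes both from empirical frequencies in a small hypothetical sample; the study itself estimates probabilities with a 3-gram Markov model, which is not reproduced here, and the pattern strings (digits naming grid points 0 to 8) are invented for illustration.

```python
from collections import Counter
from math import log2


def distribution(patterns):
    """Empirical pattern probabilities, sorted from most to least likely."""
    counts = Counter(patterns)
    total = sum(counts.values())
    return sorted((c / total for c in counts.values()), reverse=True)


def min_entropy(probs):
    """Min-entropy in bits: -log2 of the most likely pattern's probability."""
    return -log2(probs[0])


def beta_success_rate(probs, beta):
    """Chance that an attacker allowed beta guesses hits the user's pattern."""
    return sum(probs[:beta])


# Hypothetical sample of eight unlock patterns.
patterns = ["0124678"] * 4 + ["0369"] * 2 + ["01258", "036"]
probs = distribution(patterns)

print(min_entropy(probs))           # bits of security against a single best guess
print(beta_success_rate(probs, 2))  # attacker success with the two best guesses
```

A layout that spreads user choices more evenly lowers the top probabilities, which raises min-entropy and lowers the beta-success-rate; that is the direction of improvement these indicators are designed to capture.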
Author affiliations:
[Jiang, Xingpeng; He, Tingting; Zhao, Weizhong; Zhao, Yao] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Hubei, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Zhao, Weizhong; Zhao, Yao] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.;[Jiang, Xingpeng; He, Tingting; Zhao, Weizhong; Zhao, Yao] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netw, Wuhan 430079, Hubei, Peoples R China.;[Zhao, Weizhong] Guilin Univ Elect Technol, Guangxi Key Lab Trusted Software, Guilin 541004, Peoples R China.;[Zhao, Weizhong] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541004, Peoples R China.
Corresponding institution:
[Weizhong Zhao] Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, Hubei 430079, China; School of Computer, Central China Normal University, Wuhan, Hubei 430079, China; National Language Resources Monitoring & Research Center for Network Media, Central China Normal University, Wuhan, Hubei 430079, China; Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China; Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin 541004, China
Journal:
Mathematical Problems in Engineering, 2021, 2021 ISSN:1024-123X
Author affiliations:
[Cheng, Qi] Wuhan Digital Engn Inst, Prod Engn Dept, Wuhan 430074, Peoples R China.;[Zhang, Maoyuan; Wu, Di; Yang, Qing; Zhao, Zhuo; Hsu, Chingfang] Cent China Normal Univ, Comp Sch, Wuhan 430079, Peoples R China.
Abstract:
Although many group key agreement schemes have been presented, most of these protocols generate a secret key for a single group. However, in the IoT HCS, more and more communications involve multiple groups, and users can join multiple groups to communicate at the same time. Applying conventional public-key-based, one-at-a-time group key establishment protocols in this setting incurs heavy computational cost or suffers from security vulnerabilities. Moreover, in an IoT HCS, a trusted KGC is usually not available, so group members need more flexible, self-organized multigroup key generation. To address this issue, a practical scheme for efficient and flexible KGC-free polynomial-based multigroup key establishment for IoT HCS is proposed. The proposed protocol can generate multiple group keys for all group members at once, instead of generating one key at a time for a single group; more importantly, no trusted KGC is needed during group key establishment, and each user can join multiple groups at the same time using only one reserved share. The security of the proposed protocol is discussed in detail. Finally, we compare the performance of this protocol with that of the latest related group key distribution protocols. The results show that this efficient and flexible KGC-free polynomial-based multigroup key establishment protocol is better suited to practical group key agreement in IoT HCS.
Keywords:
Authentication;fog computing;Internet of Things;security
Abstract:
Fog computing can effectively provide a variety of application support for the fast-growing number of Internet of Things devices. However, the unique characteristics of fog computing also bring new security problems. In particular, identity authentication in fog computing faces new challenges: low latency (cloud servers should not be involved in authentication); fog servers that are not completely trusted; robustness (no user reregistration is required when a fog server leaves the fog); and lightweight operation (fog devices have constrained resources). To solve these problems, we propose an authentication scheme suitable for the fog computing environment, which implements mutual authentication between fog users and fog devices with the cooperation of incompletely trusted fog servers. Formal security analysis using the extended real-or-random (ROR) model shows that the proposed scheme is provably secure, and informal security analysis shows that it can resist known attacks. Compared with existing schemes, the proposed scheme supports more functionality features. In addition, a comparative analysis of the communication and computation costs of various schemes shows that our scheme is more suitable for the fog computing environment than the existing schemes.
Author affiliations:
[Huang, Xiaoyun; Huo, Ban; He, Tingting; Jiang, Xingpeng; Sun, Han; Fu, Lingling] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Peoples R China.;[Huang, Xiaoyun; Huo, Ban; He, Tingting; Jiang, Xingpeng; Sun, Han; Fu, Lingling] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.;[Sun, Han; Fu, Lingling] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Peoples R China.;[Huang, Xiaoyun] Cent China Normal Univ, Collaborat & Innovat Ctr Educ Technol, Wuhan 430079, Peoples R China.;[He, Tingting; Jiang, Xingpeng] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netw, Wuhan 430079, Peoples R China.
Corresponding institution:
[Jiang, Xingpeng] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netw, Wuhan 430079, Peoples R China.
Abstract:
Dysbiosis of the microbiome may have negative effects on a host phenotype. The microbes related to the host phenotype are regarded as microbial association signals. Recently, statistical methods based on microbiome-phenotype association tests have been extensively developed to detect these association signals. However, the currently available methods do not perform well in detecting microbial association signals across diverse sparsity levels (i.e., sparse, low sparse, non-sparse), and the real association patterns related to different host phenotypes are not unique. Here, we propose a powerful and adaptive microbiome-based association test, designated MiATDS, to detect microbial association signals with diverse sparsity levels. In particular, we define a probability degree to measure the associations between microbes and the host phenotype, and introduce an adaptive weighted sum of powered score tests that considers both the probability degree and phylogenetic information. We design numerous simulation experiments on detecting association signals with diverse sparsity levels to assess the performance of the method. We find that type I error rates are well controlled and that MiATDS achieves superior power. When applied to real data, MiATDS also proves practical and reliable. Copyright (C) 2021, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. Published by Elsevier Limited and Science Press. All rights reserved.
Author affiliations:
[Xie, Bo; Liu, Fei; Ge, Leixin; Zhou, Dan; Shi, Zunji; Zhao, Na; Wu, Gang; Zheng, Ningning] Cent China Normal Univ, Sch Life Sci, Hubei Key Lab Genet Regulat & Integrat Biol, Wuhan 430079, Peoples R China.;[Zhou, Dan] Guizhou Normal Coll, Sch Biol Sci, Guiyang 550018, Guizhou, Peoples R China.;[Jiang, Xingpeng] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.;[Halverson, Larry] Iowa State Univ, Dept Plant Pathol & Microbiol, Ames, IA USA.
Corresponding institution:
[Xie, Bo] Cent China Normal Univ, Sch Life Sci, Hubei Key Lab Genet Regulat & Integrat Biol, Wuhan 430079, Peoples R China.
Abstract:
Microbial taxon-taxon co-occurrences may directly or indirectly reflect the potential relationships between the members of a microbial community. However, the extent to which, and the specificity with which, these co-occurrences are influenced by environmental factors remain unclear. In this report, we evaluated how the dynamics of microbial taxon-taxon co-occurrence are associated with changes in environmental factors in Nan Lake in Wuhan, China, using a Modified Liquid Association method. Using 16S rRNA gene amplicon community profiles, we detected more than one thousand taxon-taxon co-occurrences that were highly correlated with one or more environmental factors across a phytoplankton bloom. These co-occurrences, referred to as environment dependent co-occurrences (ED_co-occurrences), delineate a unique network in which a taxon-taxon pair exhibits specific, and potentially dynamic, correlations with an environmental parameter, while the individual relative abundance of each taxon may not. Microcystis-involved ED_co-occurrences occupy important topological positions in the network, suggesting that relationships between the bloom-dominant species and other taxa could play a role in the interplay between the microbial community and the environment across various bloom stages. Our results may broaden our understanding of the response of a microbial community to the environment, particularly at the level of microbe-microbe associations. This article is protected by copyright. All rights reserved.
Journal:
Journal of Circuits, Systems and Computers, 2021, 30(06):2150097:1-2150097:28 ISSN:0218-1266
Author affiliations:
[Zhang, Yuzuo; Zheng, Shijue] Cent China Normal Univ, Dept Comp Sci & Technol, Wuhan 430079, Peoples R China.;[Zhang, Xinyan; Li, Yuanhao] Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan 430074, Peoples R China.
Keywords:
Neural network models;model integration;NOx emissions;power station boiler
Abstract:
In a coal-fired power generation system, the NOx emissions of the power station boiler must be predicted at the ammonia-spraying step to ensure that NOx emissions do not exceed national standards. Modeling power station boilers with traditional machine learning algorithms requires feature selection and steady-state extraction, which is not suitable for practical applications. To reduce the NOx prediction error rate under variable operating conditions, a multi-model fusion algorithm, S3LX, which combines linear regression, XGBoost, and a long short-term memory recurrent neural network, is proposed to model the NOx emissions of power station boilers. A data preprocessing scheme suitable for power station boiler data sets is proposed and implemented in this paper; it performs numerical processing, data cleaning and data standardization on the boiler data and features. A 7-day historical operating data set from a unit in Guangzhou Shajiao C Power Plant was preprocessed and then used as the training set and test set to build the NOx emission prediction model. Results show that, compared with traditional machine learning algorithms, S3LX has good prediction ability under varying conditions, with an average error of 4.28%. Compared with the average prediction errors of the multi-layer perceptron (9.16%) and SVM (7.37%), S3LX reduces the error significantly and meets practical engineering requirements.