Automatic Chinese Term Extraction Based on Prime String Decomposition (基于质子串分解的中文术语自动抽取)

Type:

Journal article

Authors:

He Tingting (何婷婷); Zhang Yong (张勇)

Corresponding author:

He, T. (tthe@mail.ccnu.edu.cn)

Author affiliations:

[He Tingting; Zhang Yong] School of Software, Tsinghua University, Beijing 100084, China

National Language Resources Monitor and Research Center (Network Media), Wuhan 430079, China

Department of Computer Science, Huazhong Normal University, Wuhan 430079, China

Corresponding institution:

School of Software, Tsinghua University, China

Language:

Chinese

Keywords:

prime string decomposition; automatic term extraction; mutual information

Keywords (English):

C-value

Journal:

Computer Engineering (计算机工程)

ISSN:

1000-3428

Year:

2006

Volume:

Issue:

Pages:

188-190

DOI:

10.3969/j.issn.1000-3428.2006.23.067

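Funding:

National Natural Science Foundation of China (60442005); National "973" Program (2004CB318104); Key Project of the Ministry of Education Science and Technology Research Fund (105117); Key Project of the State Language Commission "Tenth Five-Year Plan" for Applied Language Research (ZDI105-B01)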

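Institutional attribution:

This university is listed among the other (non-primary) institutions

Department:

School of Computer Science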

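Abstract (translated from Chinese):

To address the structural characteristics of Chinese terms, this paper proposes an automatic term extraction method based on prime string decomposition. Words are divided into two classes: prime words, which have a simple structure, and combined words, which have a complex structure. Prime words are extracted using the F-MI parameter; building on that result, combined words are then extracted by prime string decomposition. Experimental results show that the algorithm effectively improves the precision of automatic Chinese term extraction. The algorithm has been applied in the National Network Media Monitoring project, where it has shown good results.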

Abstract (English):

In view of the characteristics of Chinese words, this paper proposes an automatic term extraction (ATE) algorithm based on the decomposition of prime strings. Words are classified into two groups: prime words with simple structure and combined words with complex structure. Prime words are extracted using the F-MI parameter, and combined words are extracted by the decomposition of prime strings. Experiments show that the algorithm can effectively improve precision in Chinese ATE. This method has been applied to the project of National Language Resources Monitor a...
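
The two-stage idea in the abstract can be illustrated with a small sketch. This record does not define the F-MI parameter or the decomposition procedure, so everything below is an assumption made for illustration: `fmi_scores` treats F-MI as log-frequency multiplied by pointwise mutual information over adjacent character pairs, prime words are restricted to two-character candidates, and `decompose` reads "prime string decomposition" as splitting a longer candidate into a sequence of already accepted prime words. The function names, the `min_freq` parameter, the toy corpus, and the threshold are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def fmi_scores(corpus, min_freq=2):
    """Score adjacent character pairs by log-frequency times PMI.

    Assumption: this stands in for the paper's F-MI parameter,
    whose exact definition is not given in this record.
    """
    chars = Counter()
    pairs = Counter()
    for sentence in corpus:
        chars.update(sentence)
        pairs.update(sentence[i:i + 2] for i in range(len(sentence) - 1))
    n_chars = sum(chars.values())
    n_pairs = sum(pairs.values())

    scores = {}
    for pair, freq in pairs.items():
        if freq < min_freq:
            continue  # drop rare pairs before scoring
        p_xy = freq / n_pairs
        p_x = chars[pair[0]] / n_chars
        p_y = chars[pair[1]] / n_chars
        pmi = math.log2(p_xy / (p_x * p_y))
        scores[pair] = math.log2(freq + 1) * pmi
    return scores


def decompose(candidate, prime_words):
    """Split candidate into the fewest prime words, or return None.

    Assumption: a combined word is a concatenation of prime words.
    """
    n = len(candidate)
    best = [None] * (n + 1)  # best[i] is the shortest split of candidate[:i]
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            piece = candidate[start:end]
            if best[start] is not None and piece in prime_words:
                split = best[start] + [piece]
                if best[end] is None or len(split) < len(best[end]):
                    best[end] = split
    return best[n]


if __name__ == "__main__":
    # Toy corpus, invented for this example only.
    corpus = ["术语抽取方法", "术语自动抽取", "自动抽取术语"]
    scores = fmi_scores(corpus)
    # The threshold 5.0 is arbitrary and chosen for the toy data.
    prime_words = {pair for pair, score in scores.items() if score > 5.0}
    print(sorted(prime_words))
    print(decompose("术语自动抽取", prime_words))
```

In this sketch a candidate that cannot be fully covered by prime words is rejected (`decompose` returns `None`); a real system would likely need corpus-scale statistics and a tuned threshold rather than the fixed value used here.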
