79 results for WordNet


Relevance:

10.00%

Publisher:

Abstract:

506 p., 44 p.

Relevance:

10.00%

Publisher:

Abstract:

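Within the framework of a mapping-based data exchange system, an ontology-assisted schema matching method is proposed. It combines the WordNet lexical ontology with decision tree learning to match attribute names, builds a data type ontology to compute the semantic distance between attribute data types, and relies on a domain ontology to discover one-to-many semantic matches. These three steps progressively improve matching quality. Experiments on real application data show that the method achieves high precision and recall.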
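
The data type step can be illustrated with a small sketch. It assumes the data type ontology is a tree and that semantic distance is the number of edges between two types; the hierarchy in `PARENT` and the function names are hypothetical and are not taken from the paper.

```python
# Hypothetical data type ontology, stored as child -> parent.
# The hierarchy is an illustrative assumption, not the paper's ontology.
PARENT = {
    "number": "any",
    "text": "any",
    "temporal": "any",
    "integer": "number",
    "decimal": "number",
    "char": "text",
    "varchar": "text",
    "date": "temporal",
    "timestamp": "temporal",
}


def ancestors(node):
    """Return the chain from node up to its root, node first."""
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def type_distance(a, b):
    """Count the edges on the tree path between two data types."""
    up_a = ancestors(a)
    depth_in_b = {n: i for i, n in enumerate(ancestors(b))}
    for i, n in enumerate(up_a):
        if n in depth_in_b:
            # n is the lowest common ancestor of a and b.
            return i + depth_in_b[n]
    raise ValueError("types do not share a root")
```

Under this assumption, sibling types such as `integer` and `decimal` are closer to each other than types from different branches, which a matcher could use to rank candidate attribute pairs.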

Relevance:

10.00%

Publisher:

Abstract:

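With the rapid development of network technology and the deepening of enterprise informatization, the data, information, and knowledge distributed across an enterprise have become more diverse and complex, and enterprise information systems more open. How to integrate and share this data, information, and knowledge has become a key problem. Data integration technology addresses this need by providing dynamic, flexible, real-time integration and sharing of distributed, heterogeneous, and complex data, information, and knowledge. OnceDI 2.0 handles the interoperability of heterogeneous data sources at the data level well: it meets varied data integration needs, works across platforms and multiple kinds of data sources, offers practical mechanisms such as incremental transfer and conflict resolution, and provides comprehensive security and management tools. It also has shortcomings, however: the receiving data source can only be defined from the data blocks received, by which time the sending process has already finished, and the field correspondences between the sending and receiving data sources must be built entirely by hand. The goal of data integration is to give users a uniform application interface for accessing multiple distributed, independent, heterogeneous data sources. Visual configuration of the ETL (Extract-Transform-Load) process raises questions such as how to help users understand the ETL process better and how to let them configure, manage, and execute it more effectively and easily. Based on a study of the characteristics of the data integration process, and focusing on the problems of the visual ETL process, this thesis takes data transformation and data filtering as its research directions. For data transformation, the thesis addresses both schema matching and instance transformation. For schema matching, it proposes an ontology-assisted automatic schema matching algorithm with three parts: computing attribute name matches by combining decision tree learning with the WordNet lexical ontology; defining an attribute data type ontology to match attributes that carry data types; and using a domain ontology to build indirect mappings between attributes in order to resolve one-to-many semantic matches. This method makes the visual data transformation process simpler to operate and produces automatic matching results that users find more satisfactory. For instance transformation, the thesis proposes a design for an instance transformation tool with a friendlier interface that, more importantly, makes instance-level transformation operations clearer and simpler for users. For data filtering, starting from the characteristics of how data quality control conditions are set, the thesis proposes a design for a tool for setting such conditions. Finally, based on the data integration model of OnceDI 3.0 and its three-tier architecture of client, control center, and DI server, the thesis designs and implements a visual ETL tool for data integration, applying design patterns to improve the extensibility of the system.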
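
The attribute name step can be sketched as follows. The sketch covers only the lexical half of that step. The synonym groups are a toy stand-in for WordNet synsets, the decision tree component is not reproduced, and all names here are hypothetical.

```python
import re

# Toy synonym groups standing in for WordNet synsets (illustrative assumption).
SYNONYM_GROUPS = [
    {"client", "customer"},
    {"id", "identifier", "number", "no"},
    {"address", "addr"},
    {"name", "title"},
]


def tokens(attribute_name):
    """Split names such as 'customerId' or 'client_no' into lowercase tokens."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", attribute_name)
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def canonical(token):
    """Map a token to one representative of its synonym group, if any."""
    for group in SYNONYM_GROUPS:
        if token in group:
            return min(group)
    return token


def name_similarity(a, b):
    """Jaccard overlap of the canonicalized token sets of two attribute names."""
    set_a = {canonical(t) for t in tokens(a)}
    set_b = {canonical(t) for t in tokens(b)}
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)
```

For example, `name_similarity("customerId", "client_no")` treats both names as the same pair of concepts, whereas a plain string comparison would find little in common.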

Relevance:

10.00%

Publisher:

Abstract:

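The interface name of a domain component is not merely a unique identifier; it often carries semantic information from the relevant domain. This thesis assumes that component designers, when naming component interfaces, try to include the semantic information of the interface in its domain, namely the environment in which the interface runs, the domain objects it operates on, and the operations it performs on those objects. The thesis attempts to extract this information from interface names and uses it as the basis for computing a semantics-based degree of match against user queries. It applies an ontological view to knowledge representation: the basic elements of an ontology, concepts and the relations among them, serve as a knowledge model describing the real world. It describes how to use an ontology to represent a set of basic concepts in a domain and the relations among them, and how to define rules, built from these terms and relations, that delimit the extension of the domain ontology. The thesis uses an improved WordNet model to describe the domain ontology, derives an original corpus describing the basic semantics of the insurance domain by analyzing that domain, uses this corpus to parse component interfaces, and records the resulting semantic information in a parsed component repository. User queries are likewise described semantically; their similarity to the semantics contained in the interface names in the parsed component repository is computed, and the results are returned to the user. The approach was put into practice in a project of the Beijing Municipal Science and Technology Commission: a reference model of an insurance domain ontology was built, and a semantics-based interface matching system was designed using the component repository of a core insurance business system as the experimental subject, with good results.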

Relevance:

10.00%

Publisher:

Abstract:

The main lines of work on the Semantic Web in the field of television archives are analyzed and described. To this end, the Semantic Web is first analyzed and contextualized from a general perspective, followed by an analysis of the main initiatives that work with audiovisual material: the MuNCH project, the S5T project, Semantic Television, and VideoActive.

Relevance:

10.00%

Publisher:

Abstract:

Most studies of conceptual knowledge in the brain focus on a narrow range of concrete conceptual categories, rely on the researchers' intuitions about which objects belong to these categories, and assume a broadly taxonomic organization of knowledge. In this fMRI study, we focus on concepts with a variety of concreteness levels; we use a state-of-the-art lexical resource (WordNet 3.1) as the source for a relatively large number of category distinctions and compare a taxonomic style of organization with a domain-based model (associating concepts with scenarios). Participants mentally simulated situations associated with concepts when cued by text stimuli. Using multivariate pattern analysis, we find evidence that all Taxonomic categories and Domains can be distinguished from fMRI data and also observe a clear concreteness effect: Tools and Locations can be reliably predicted for unseen participants, but less concrete categories (e.g., Attributes, Communications, Events, Social Roles) can only be reliably discriminated within participants. A second concreteness effect relates to the interaction of Domain and Taxonomic category membership: Domain (e.g., relation to Law vs. Music) can be better predicted for less concrete categories. We repeated the analysis within anatomical regions, observing discrimination between all or most categories in the left middle occipital and temporal gyri, and more specialized discrimination for the concrete categories Tool and Location in the left precentral and fusiform gyri, respectively. Highly concrete and highly abstract Taxonomic categories and Domain were segregated in frontal regions. We conclude that both Taxonomic and Domain class distinctions are relevant for interpreting neural structuring of concrete and abstract concepts.

Relevance:

10.00%

Publisher:

Abstract:

Distributional semantics tries to characterize the meaning of words by the contexts in which they occur. The similarity of words can hence be derived from the similarity of their contexts. The contexts of a word are usually represented as vectors of the words appearing near that word in a corpus. Previous research observed that similarity measures for the context vectors of two words depend on the frequency of these words. In the present paper we investigate this dependency in more detail for one similarity measure, the Jensen-Shannon divergence. We give an empirical model of this dependency and propose, as an alternative similarity measure, the deviation of the observed Jensen-Shannon divergence from the divergence expected on the basis of the frequencies of the words. We show that this new similarity measure is superior to both the Jensen-Shannon divergence and the cosine similarity in a task in which pairs of words, taken from WordNet, have to be classified as synonyms or not.
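
As a minimal sketch of the base measure only, the Jensen-Shannon divergence between two context distributions can be computed as below. The frequency-based expected value that the paper subtracts is an empirical model not given in the abstract, so it is not reproduced here. Function names are illustrative, and logarithms are taken to base 2 so that the result lies between 0 and 1.

```python
import math
from collections import Counter


def distribution(context_words):
    """Turn a list of context words into relative frequencies."""
    counts = Counter(context_words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def kl(p, q):
    """Kullback-Leibler divergence in bits; q must be nonzero wherever p is."""
    return sum(pv * math.log2(pv / q[w]) for w, pv in p.items() if pv > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence: mean KL divergence to the midpoint m."""
    vocab = set(p) | set(q)
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Identical context distributions give a divergence of 0 and distributions with no shared context words give 1, so lower values indicate more similar contexts.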

Relevance:

10.00%

Publisher:

Abstract:

In this thesis we analyze dictionary graphs and some other kinds of graphs using the PageRank algorithm. We calculated the correlation between the degree and the PageRank of all nodes for graphs obtained from the Merriam-Webster dictionary, a French dictionary, and the WordNet hypernym and synonym dictionaries. Our conclusion was that PageRank can be a good tool for comparing the quality of dictionaries. We also studied some artificial social and random graphs. We found that when we omitted some random nodes from each of the graphs, there were no significant changes in the ranking of the nodes according to their PageRank. We also discovered that some of the social graphs selected for our study were less resistant to changes of PageRank.
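
The degree and PageRank comparison can be sketched with a plain power iteration. This is an illustration under stated assumptions, not the thesis's implementation: the damping factor of 0.85, the uniform redistribution of rank from nodes without outgoing links, and the use of in-degree are all choices made here for the example.

```python
import math


def pagerank(graph, damping=0.85, iterations=100):
    """graph maps each node to a list of successors; returns node -> rank."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not graph.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v, targets in graph.items():
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
        rank = new_rank
    return rank


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

For a dictionary graph, an edge from one word to another could mean that the second word appears in the definition of the first; `pearson` applied to the in-degrees and the ranks of the same nodes then gives the kind of correlation the abstract refers to.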

Relevance:

10.00%

Publisher:

Abstract:

This paper compares statistical techniques for paraphrase identification with a semantic technique. The statistical techniques used for comparison are word-set and word-order based methods, whereas the semantic technique is the WordNet similarity matrix method described by Stevenson and Fernando in [3].
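
The contrast between the two families of methods can be sketched as follows. The word-set score is a Jaccard overlap. The matrix score follows the general form a W b^T / (|a| |b|) over binary word vectors, which is our reading of the similarity matrix idea, and the details in [3] may differ. The word similarity function is a toy stand-in for a WordNet-based measure, and all names are hypothetical.

```python
import math


def word_set_similarity(s1, s2):
    """Jaccard overlap of the word sets of two sentences."""
    a, b = set(s1.lower().split()), set(s2.lower().split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def matrix_similarity(s1, s2, word_sim):
    """a W b^T / (|a||b|) for binary word vectors a and b.

    word_sim(x, y) supplies the entries of W. With binary vectors the
    numerator reduces to a sum over all word pairs across the two sentences.
    """
    a, b = set(s1.lower().split()), set(s2.lower().split())
    if not a or not b:
        return 0.0
    numerator = sum(word_sim(x, y) for x in a for y in b)
    return numerator / (math.sqrt(len(a)) * math.sqrt(len(b)))


# Toy word similarities standing in for a WordNet measure (assumption).
TOY_SIM = {frozenset(("car", "automobile")): 0.9}


def toy_word_sim(x, y):
    """Return 1.0 for identical words, a toy score for listed pairs, else 0.0."""
    if x == y:
        return 1.0
    return TOY_SIM.get(frozenset((x, y)), 0.0)
```

With "the car stopped" and "the automobile stopped", the word-set score sees only two shared words out of four, while the matrix score also credits the related pair. Because off-diagonal entries add to the numerator, the matrix score is not bounded by 1 in general.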

Relevance:

10.00%

Publisher:

Abstract:

In this report, we investigate the relationship between the semantic and syntactic properties of verbs. Our work is based on English Verb Classes and Alternations (Levin, 1993). We explore how these classes are manifested in other languages, in particular in Bangla, German, and Korean. Our report includes a survey and classification of several hundred verbs from these languages into the cross-linguistic equivalents of Levin's classes. We also explore two ways in which our findings may be used to enhance WordNet: making the English syntactic information of WordNet more fine-grained, and making WordNet multilingual.

Relevance:

10.00%

Publisher:

Abstract:

This paper presents a hierarchical clustering method for semantic Web service discovery. The method aims to improve the accuracy and efficiency of traditional service discovery based on the vector space model. Each Web service is converted into a standard vector format from its Web service description document. With the help of WordNet, a semantic analysis is conducted to reduce the dimension of the term vector and to perform semantic expansion to meet the user's service request. The process and algorithm of semantic Web service discovery based on hierarchical clustering are discussed. Validation is carried out on the dataset.
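
The pipeline described here can be sketched in a few lines. The sketch makes several assumptions that are not from the paper: the `CONCEPT` map is a toy stand-in for WordNet-based term merging, similarity is the cosine of term frequency vectors, and clustering is agglomerative with average linkage and a similarity threshold as the stopping rule.

```python
import math
from collections import Counter

# Toy map from terms to a shared concept, standing in for WordNet (assumption).
CONCEPT = {
    "purchase": "buy",
    "automobile": "car",
    "vehicle": "car",
    "reserve": "book",
    "lodging": "hotel",
}


def vector(description):
    """Term frequency vector after merging synonymous terms into one concept."""
    return Counter(CONCEPT.get(t, t) for t in description.lower().split())


def cosine(u, v):
    """Cosine similarity of two sparse term vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def cluster(vectors, threshold):
    """Merge the most similar clusters until no pair reaches the threshold."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        # Average pairwise similarity between the members of two clusters.
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        score, x, y = max(
            (linkage(a, b), x, y)
            for x, a in enumerate(clusters)
            for y, b in enumerate(clusters)
            if x < y
        )
        if score < threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters
```

Merging terms into concepts both shrinks the vector dimension and lets descriptions such as "purchase automobile" and "buy car" land in the same cluster, so a request only needs to be compared against the clusters closest to it.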

Relevance:

10.00%

Publisher:

Abstract:

The central problem of automatic retrieval from unformatted text is that computational devices are not adequately trained to look for associated information. Complete understanding and information retrieval, however, would require building a complete artificial intelligence. This paper describes a method for achieving significant information retrieval by using a semantic search engine. The underlying semantic information is stored in a network of clarified words, linked by logical connections. We employ simple scoring techniques on collections of paths in this network to establish a degree of relevance between a document and a clarified search criterion. This technique has been applied with success to test examples and can be easily scaled up to search large documents.
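
One way to picture path-based scoring is the sketch below. The abstract does not specify the scoring rule, so everything here is an assumption made for illustration: the toy network, the use of shortest paths only, and the score of 1 / (1 + distance) averaged over the document words.

```python
from collections import deque

# Toy semantic network: word -> directly linked words (illustrative assumption).
NETWORK = {
    "dog": ["canine", "pet"],
    "canine": ["dog", "mammal"],
    "pet": ["dog", "cat"],
    "cat": ["pet", "feline"],
    "feline": ["cat", "mammal"],
    "mammal": ["canine", "feline", "animal"],
    "animal": ["mammal"],
}


def distances_from(source, network):
    """Breadth-first search: shortest path length from source to each word."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in network.get(node, []):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def relevance(document_words, query_word, network=NETWORK):
    """Average of 1 / (1 + distance) over document words; unreachable words add 0."""
    if not document_words:
        return 0.0
    dist = distances_from(query_word, network)
    total = sum(1.0 / (1 + dist[w]) for w in document_words if w in dist)
    return total / len(document_words)
```

A document whose words sit close to the search term in the network scores higher than one whose words are several links away, and words outside the network contribute nothing.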

Relevance:

10.00%

Publisher:

Abstract:

Background: Continuous content management of health information portals is vital for their sustainability and widespread acceptance. The knowledge and experience of a domain expert are essential for content management in the health domain. Online health resources are being generated at an exponential rate, so manually examining them for relevance to a specific topic and audience is a formidable challenge for domain experts. Intelligent content discovery for effective content management is a less researched topic. An existing expert-endorsed content repository can provide the necessary leverage to automatically identify relevant resources and evaluate qualitative metrics. Objective: This paper reports on design research towards an intelligent technique for automated content discovery and ranking for health information portals. The proposed technique aims to improve the efficiency of the current, mostly manual, process of portal content management by utilising an existing expert-endorsed content repository as a supporting base and as a benchmark for evaluating the suitability of new content. Methods: A model for content management was established based on a field study of potential users. The proposed technique is integral to this content management model and executes in several phases (i.e., query construction, content search, text analytics, and fuzzy multi-criteria ranking). The construction of multi-dimensional search queries with input from WordNet, the use of multi-word and single-word terms as representative semantics for text analytics, and the use of fuzzy multi-criteria ranking for subjective evaluation of quality metrics are original contributions reported in this paper. Results: The feasibility of the proposed technique was examined with experiments conducted on an actual health information portal, the BCKOnline portal. Both intermediary and final results generated by the technique are presented in the paper, and these help to establish the benefits of the technique and its contribution towards effective content management. Conclusions: The prevalence of large numbers of online health resources is a key obstacle for domain experts involved in content management of health information portals and websites. The proposed technique has proven successful at searching for and identifying resources and at measuring their relevance. It can be used to support the domain expert in content management and thereby ensure that the health portal is up to date.

Relevance:

10.00%

Publisher:

Abstract:

A key to maintaining an enterprise's competitiveness is the ability to describe, standardize, and adapt the way it reacts to certain types of business events and how it interacts with suppliers, partners, competitors, and customers. In this context, the field of organization modeling has emerged with the aim of creating models that help build a state of self-awareness in the organization. The context of this project is the use of the Semantic Web in organizational modeling. The advantages of Semantic Web technology can be used to improve the way organizations are modeled. This was accomplished by using a semantic wiki to model organizations. Our research and implementation had two main purposes: formalization of textual content in semantic wiki pages, and automatic generation of diagrams from organization data stored in the semantic wiki pages.