Extracting Query Interfaces Based on Form Structures and Semantic Similarity


Author(s): Hong, Jun; He, Zhongtian; Bell, David A.
Date(s)

01/04/2009

Abstract

Web databases are now pervasive. Such a database can be accessed only via its query interface (usually an HTML query form). Extracting Web query interfaces, which creates a formal representation of a query form by extracting the set of query conditions in it, is a critical step in data integration across multiple Web databases. This paper presents a novel approach to extracting Web query interfaces. In this approach, a generic set of query condition rules is created to define query conditions that are semantically equivalent to SQL search conditions. Query condition rules represent the semantic roles that labels and form elements play in query conditions, and how they are hierarchically grouped into constructs of query conditions. To group labels and form elements in a query form, we explore both their structural proximity in the hierarchy of structures in the query form, which is captured by a tree of nested tags in the HTML code of the form, and their semantic similarity, which is captured by various short texts used in labels, form elements and their properties. We have implemented the proposed approach, and our experimental results show that the approach is highly effective.
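
The grouping idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' rule-based method, whose details this record does not give. It assumes structural proximity is measured as the distance between two nodes in the tree of nested tags, and semantic similarity as token overlap (Jaccard) between a label's text and a form element's `name` attribute. The names `FormTreeParser`, `tree_distance`, `semantic_similarity`, `pair_labels`, the `weight` parameter, and the sample form are all hypothetical.

```python
import re
from html.parser import HTMLParser

VOID_TAGS = {"input", "br", "img", "hr", "meta", "link"}
FIELD_TAGS = {"input", "select", "textarea"}


class FormTreeParser(HTMLParser):
    """Records the tag-tree path of every label and form element."""

    def __init__(self):
        super().__init__()
        self.path = []           # stack of (tag, unique node id)
        self.counter = 0
        self.labels = []         # (path, label text)
        self.fields = []         # (path, name attribute)
        self._label_path = None  # path of the <label> currently open

    def handle_starttag(self, tag, attrs):
        self.counter += 1
        node = (tag, self.counter)
        current = tuple(self.path) + (node,)
        if tag in FIELD_TAGS:
            self.fields.append((current, dict(attrs).get("name") or ""))
        if tag == "label":
            self._label_path = current
        if tag not in VOID_TAGS:  # void tags never get a closing tag
            self.path.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, tolerating sloppy HTML.
        for i in range(len(self.path) - 1, -1, -1):
            if self.path[i][0] == tag:
                del self.path[i:]
                break
        if tag == "label":
            self._label_path = None

    def handle_data(self, data):
        text = data.strip()
        if text and self._label_path is not None:
            self.labels.append((self._label_path, text))


def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def semantic_similarity(a, b):
    """Jaccard overlap of word tokens (an assumed stand-in measure)."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta and tb else 0.0


def tree_distance(p, q):
    """Number of edges between two nodes, given their root-to-node paths."""
    common = 0
    for x, y in zip(p, q):
        if x != y:
            break
        common += 1
    return (len(p) - common) + (len(q) - common)


def pair_labels(html, weight=0.5):
    """Assign each form element the label with the best combined score."""
    parser = FormTreeParser()
    parser.feed(html)
    parser.close()
    pairs = {}
    for fpath, name in parser.fields:
        best = max(
            parser.labels,
            key=lambda lab: weight * semantic_similarity(lab[1], name)
            + (1 - weight) / (1 + tree_distance(lab[0], fpath)),
            default=None,
        )
        pairs[name] = best[1] if best else None
    return pairs


SAMPLE_FORM = """
<form><table>
  <tr><td><label>Title</label></td><td><input name="book_title"></td></tr>
  <tr><td><label>Author</label></td><td><input name="author_name"></td></tr>
</table></form>
"""

print(pair_labels(SAMPLE_FORM))
```

In the sample form, each input shares a table row with its label, so the row is their lowest common ancestor and the structural term favours that label; the shared token in the `name` attribute reinforces the same choice.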

Identifier

http://pure.qub.ac.uk/portal/en/publications/extracting-query-interfaces-based-on-form-structures-and-semantic-similarity(5a7c27ba-45d7-46f4-a456-1ff699ea2d2e).html

http://dx.doi.org/10.1109/ICDE.2009.215

Language(s)

eng

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Rights

info:eu-repo/semantics/restrictedAccess

Source

Hong, J, He, Z & Bell, D A 2009, Extracting Query Interfaces Based on Form Structures and Semantic Similarity. In 2009 IEEE 25th International Conference on Data Engineering (ICDE 2009). Institute of Electrical and Electronics Engineers (IEEE), pp. 1259-1262, ICDE 2009 25th International Conference on Data Engineering, Shanghai, China, 29-2 April. DOI: 10.1109/ICDE.2009.215

Keywords

#/dk/atira/pure/subjectarea/asjc/1700/1710 #Information Systems #/dk/atira/pure/subjectarea/asjc/1700/1711 #Signal Processing #/dk/atira/pure/subjectarea/asjc/1700/1712 #Software

Type

contributionToPeriodical