Discovering interesting information with advances in web technology


Author(s): Nayak, Richi; Senellart, Pierre; Suchanek, Fabian M; Varde, Aparna S
Date(s)

01/12/2012

Abstract

The Web is a steadily evolving resource comprising much more than mere HTML pages. With its ever-growing data sources in a variety of formats, it provides great potential for knowledge discovery. In this article, we shed light on some interesting phenomena of the Web: the deep Web, which surfaces database records as Web pages; the Semantic Web, which defines meaningful data exchange formats; XML, which has established itself as a lingua franca for Web data exchange; and domain-specific markup languages, which are designed based on XML syntax with the goal of preserving semantics in targeted domains. We detail these four developments in Web technology, and explain how they can be used for data mining. Our goal is to show that all these areas can be as useful for knowledge discovery as the HTML-based part of the Web.

Identifier

http://eprints.qut.edu.au/62117/

Publisher

Association for Computing Machinery, Inc.

Relation

DOI:10.1145/2481244.2481255

Nayak, Richi, Senellart, Pierre, Suchanek, Fabian M, & Varde, Aparna S (2012) Discovering interesting information with advances in web technology. SIGKDD Explorations, 14(2), pp. 63-81.

Source

School of Electrical Engineering & Computer Science; Science & Engineering Faculty

Keywords

#Web technology #Deep web #Semantic web #Data exchange formats #XML

Type

Journal Article