SmartClean: an incremental data cleaning tool


Author(s): Oliveira, Paulo; Rodrigues, Fátima; Henriques, Pedro
Date(s)

15/05/2013

15/05/2013

2009

Abstract

This paper presents the SmartClean tool, whose purpose is to detect and correct data quality problems (DQPs). Compared with existing tools, SmartClean has one main advantage: the user does not need to specify the execution sequence of the data cleaning operations. To make this possible, an execution sequence was developed, and the problems are manipulated (i.e., detected and corrected) following that sequence. The sequence also supports the incremental execution of the operations. The underlying architecture of the tool is presented and its components are described in detail. The validity of the tool and, consequently, of its architecture is demonstrated through a case study. Although SmartClean also has cleaning capabilities at all other levels, this paper describes only those related to the attribute value level.
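
The idea of a predefined execution sequence can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Operation` and `Cleaner` names, the three example operations, their order, and the reading of "incremental" as "process only values added since the previous run". The abstract does not state the actual sequence or the mechanism SmartClean uses.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Operation:
    """Hypothetical attribute-value-level operation: a detector plus a corrector."""
    name: str
    detect: Callable[[str], bool]
    correct: Callable[[str], str]


# Illustrative fixed order only; the real SmartClean sequence is not given in the abstract.
SEQUENCE: List[Operation] = [
    Operation("missing_value", lambda v: v.strip() == "", lambda v: "UNKNOWN"),
    Operation("surrounding_whitespace", lambda v: v != v.strip(), lambda v: v.strip()),
    Operation("inconsistent_case", lambda v: v != v.upper(), lambda v: v.upper()),
]


class Cleaner:
    """Applies the operations in a fixed order, so the user never chooses the order."""

    def __init__(self, sequence: List[Operation]) -> None:
        self.sequence = sequence
        self.processed = 0  # number of values already cleaned by earlier runs
        self.log: List[Tuple[int, str, str, str]] = []  # (index, operation, before, after)

    def run(self, column: List[str]) -> List[str]:
        """Clean, in place, only the values appended since the previous run."""
        for i in range(self.processed, len(column)):
            for op in self.sequence:
                if op.detect(column[i]):
                    fixed = op.correct(column[i])
                    self.log.append((i, op.name, column[i], fixed))
                    column[i] = fixed
        self.processed = len(column)
        return column
```

In this sketch, calling `run` a second time after new values have been appended leaves the previously cleaned values untouched and applies the same sequence only to the new ones.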

Identifier

DOI 10.1109/QSIC.2009.67

978-1-4244-5912-4

1550-6002

http://hdl.handle.net/10400.22/1583

Language(s)

eng

Publisher

IEEE

Relation

Quality Software

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5381543

Rights

closedAccess

Keywords #Data cleaning #Data quality problems #Detection #Correction #Architecture #Tool
Type

conferenceObject