SmartClean: an incremental data cleaning tool
Date(s) |
15/05/2013
15/05/2013
2009
|
---|---|
Abstract |
This paper presents SmartClean, a tool that detects and corrects data quality problems (DQPs). Its main advantage over existing tools is that the user does not need to specify the execution sequence of the data cleaning operations: an execution sequence was developed for this purpose, and the problems are handled (i.e., detected and corrected) following that sequence. The sequence also supports incremental execution of the operations. The paper presents the underlying architecture of the tool and describes its components in detail. A case study demonstrates the validity of the tool and, consequently, of its architecture. Although SmartClean also has cleaning capabilities at the other levels, only those related to the attribute value level are described in this paper. |
Identifier |
DOI 10.1109/QSIC.2009.67 978-1-4244-5912-4 1550-6002 |
Language(s) |
eng |
Publisher |
IEEE |
Relation |
Quality Software http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5381543 |
Rights |
closedAccess |
Keywords | #Data cleaning #Data quality problems #Detection #Correction #Architecture #Tool |
Type |
conferenceObject |