Task clustering on ETL systems – A pattern-oriented approach
Date(s) | 20/07/2015 |
---|---|
Abstract | Data warehouse populating processes are usually data-oriented workflows composed of dozens of granular tasks responsible for integrating data from different sources. Specific subsets of these tasks can be grouped into a collection, together with their relationships, to form higher-level constructs. Increasing task granularity allows processes to be generalized, simplifying their views and providing methods to carry expertise over to new applications. Well-proven practices can be used to describe general solutions that use basic skeletons, configured and instantiated according to a set of specific integration requirements. Patterns can be applied to ETL processes not only to simplify a possible conceptual representation but also to reduce the gap that often exists between two design perspectives. In this paper, we demonstrate the feasibility and effectiveness of an ETL pattern-based approach using task clustering, analyzing a real-world ETL scenario through the definitions of two commonly used clusters of tasks: a data lookup cluster and a data conciliation and integration cluster. |
Identifier | |
Language(s) | por |
Rights | info:eu-repo/semantics/restrictedAccess |
Keywords | #Data Warehousing Systems #ETL Conceptual Modelling #Task Clustering #ETL Patterns #ETL Skeletons #BPMN #Kettle |
Type | info:eu-repo/semantics/conferenceObject |