30 results for Data Warehousing Systems

at Universidade do Minho


Relevance:

100.00%

Publisher:

Abstract:

Over the last few years, considerable research effort has gone into improving the design of ETL (Extract-Transform-Load) systems. ETL systems are considered very time-consuming, error-prone, and complex, involving several participants from different knowledge domains. ETL processes are among the most important components of a data warehousing system and are strongly influenced by the complexity of business requirements and by how those requirements change and evolve. These aspects influence not only the structure of a data warehouse but also the structures of the data sources involved. To minimize the negative impact of such variables, we propose the use of ETL patterns to build specific ETL packages. In this paper, we formalize this approach using BPMN (Business Process Model and Notation) for modelling more conceptual ETL workflows, mapping them to real execution primitives through a domain-specific language that allows for the generation of specific instances that can be executed in a commercial ETL tool.

Relevance:

100.00%

Publisher:

Abstract:

Usually, data warehousing populating processes are data-oriented workflows composed of dozens of granular tasks that are responsible for integrating data coming from different data sources. Specific subsets of these tasks can be grouped into a collection, together with their relationships, to form higher-level constructs. Increasing task granularity allows for the generalization of processes, simplifying their views and providing methods to carry expertise over to new applications. Well-proven practices can be used to describe general solutions that use basic skeletons configured and instantiated according to a set of specific integration requirements. Patterns can be applied to ETL processes, aiming not only to simplify a possible conceptual representation but also to reduce the gap that often exists between two design perspectives. In this paper, we demonstrate the feasibility and effectiveness of an ETL pattern-based approach using task clustering, analyzing a real-world ETL scenario through the definitions of two commonly used clusters of tasks: a data lookup cluster and a data conciliation and integration cluster.
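As a rough illustration of the task-clustering idea, the sketch below bundles three granular tasks (key normalization, reference lookup, and routing of unmatched rows) into one reusable "data lookup" cluster. It is only a sketch under stated assumptions: the class `LookupCluster`, its fields, and the sample columns are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Any

Row = dict[str, Any]


@dataclass
class LookupCluster:
    """Hypothetical cluster that bundles the granular tasks of a data lookup."""

    reference: dict[str, int]  # natural key -> surrogate key
    key_column: str            # column holding the natural key
    target_column: str         # column that receives the surrogate key
    rejected: list[Row] = field(default_factory=list)

    def _normalize(self, value: Any) -> str:
        # Task 1: clean the natural key before comparing it.
        return str(value).strip().upper()

    def run(self, rows: list[Row]) -> list[Row]:
        matched: list[Row] = []
        for row in rows:
            key = self._normalize(row[self.key_column])
            if key in self.reference:
                # Task 2: attach the surrogate key found in the reference data.
                matched.append({**row, self.target_column: self.reference[key]})
            else:
                # Task 3: route unmatched rows to a quarantine list.
                self.rejected.append(row)
        return matched


# Illustrative data only.
cluster = LookupCluster(
    reference={"PT": 1, "ES": 2},
    key_column="country",
    target_column="country_sk",
)
loaded = cluster.run([
    {"country": " pt ", "amount": 10},
    {"country": "fr", "amount": 5},
])
```

Configuring the same skeleton with a different reference table and column names is what would make such a cluster reusable across integration scenarios.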

Relevance:

100.00%

Publisher:

Abstract:

ETL conceptual modeling is a very important activity in any data warehousing system project. Having a high-level system representation that allows for a clear identification of the main parts of a data warehousing system is clearly a great advantage, especially in the early stages of design and development. However, the effort to model an ETL system conceptually is rarely properly rewarded. Translating ETL conceptual models directly into something that saves work and time in the concrete implementation of the system would, in fact, be a great help. In this paper we present and discuss a hybrid approach to this problem, combining the simplicity of interpretation and power of expression of BPMN in ETL system conceptualization with the use of ETL patterns to automatically produce an ETL skeleton, a first prototype system, which can be executed in a commercial ETL tool like Kettle.

Relevance:

100.00%

Publisher:

Abstract:

Master's dissertation in Systems Engineering

Relevance:

100.00%

Publisher:

Abstract:

Today it is easy to find many tools for defining data migration schemas among different types of information systems. Data migration processes are usually implemented in a very diverse range of applications, from conventional operational systems to data warehousing platforms. The implementation of a data migration process often involves serious planning, including the development of conceptual migration schemas at early stages. Such schemas help architects and engineers to plan and discuss the most adequate way to migrate data between two different systems. In this paper we present and discuss a way of enriching data migration conceptual schemas in BPMN using a domain-specific language, demonstrating how to convert such enriched schemas into a first corresponding physical representation (a skeleton) in a conventional ETL implementation tool like Kettle.

Relevance:

100.00%

Publisher:

Abstract:

Today, recovering urban waste requires effective management services, which usually imply sophisticated monitoring and analysis mechanisms. This is essential for the smooth running of the entire recycling process as well as for planning and controlling urban waste recovery. In this paper we present a business intelligence system specially designed and implemented to support regular decision-making tasks in urban waste management processes. The system provides a set of domain-oriented analytical tools for studying and characterizing potential scenarios of urban waste collection processes, as well as for supporting waste management in urban areas, allowing for the organization and optimization of collection services. To clarify how the system was developed and how it operates, particularly in process visualization and data analysis, we also present the organization model of the system, the services it provides, and the interface platforms for exploring data.

Relevance:

100.00%

Publisher:

Abstract:

Modeling the Extract-Transform-Load (ETL) processes of a Data Warehousing System has always been a challenge. The heterogeneity of the sources, the quality of the data obtained, and the conciliation process are some of the issues that must be addressed in the design phase of this critical component. Commercial ETL tools often provide proprietary diagrammatic components and modeling languages that are not standard, and thus do not provide the ideal separation between a modeling platform and an execution platform. This separation, in conjunction with the use of standard notations and languages, is critical in a system that tends to evolve over time and that cannot be undermined by a normally expensive tool that becomes an unsatisfactory component. In this paper we demonstrate the application of Relational Algebra as a modeling language for an ETL system, as an effort to standardize operations and provide a basis for uncommon ETL execution platforms.
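To make the idea of modeling an ETL step with Relational Algebra more concrete, the sketch below expresses a small cleaning-and-lookup step as a composition of selection, natural join, and projection over in-memory relations. It is a minimal illustration under stated assumptions, not the notation or implementation used in the paper: the helper functions and the sample relations `sales` and `customers` are hypothetical.

```python
from typing import Any, Callable

Row = dict[str, Any]
Relation = list[Row]


def select(rel: Relation, pred: Callable[[Row], bool]) -> Relation:
    """Selection (sigma): keep the rows that satisfy the predicate."""
    return [row for row in rel if pred(row)]


def project(rel: Relation, attrs: list[str]) -> Relation:
    """Projection (pi): keep the named attributes and drop duplicate rows."""
    seen: set[tuple] = set()
    out: Relation = []
    for row in rel:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def natural_join(left: Relation, right: Relation) -> Relation:
    """Natural join: combine rows that agree on every shared attribute."""
    out: Relation = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            if all(l_row[a] == r_row[a] for a in shared):
                out.append({**l_row, **r_row})
    return out


# Hypothetical source relation and lookup relation.
sales = [
    {"customer_id": 1, "amount": 120.0, "status": "ok"},
    {"customer_id": 2, "amount": 80.0, "status": "void"},
    {"customer_id": 1, "amount": 40.0, "status": "ok"},
]
customers = [
    {"customer_id": 1, "customer_sk": 101},
    {"customer_id": 2, "customer_sk": 102},
]

# pi[customer_sk, amount]( sigma[status = 'ok'](sales) JOIN customers )
fact_rows = project(
    natural_join(select(sales, lambda r: r["status"] == "ok"), customers),
    ["customer_sk", "amount"],
)
```

Because the step is written as a plain algebraic expression, the same expression could in principle be handed to any execution platform that implements these three operators.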

Relevance:

100.00%

Publisher:

Abstract:

The MAP-i Doctoral Programme in Informatics, of the Universities of Minho, Aveiro and Porto

Relevance:

90.00%

Publisher:

Abstract:

Integrated master's dissertation in Information Systems Engineering and Management

Relevance:

80.00%

Publisher:

Abstract:

This paper presents a mobile information system called the Vehicle-to-Anything Application (V2Anything App) and explains its conceptual aspects. The application aims to give relevant information to Full Electric Vehicle (FEV) drivers by integrating several data sources in a mobile application, thus contributing to the deployment of the electric mobility process. The V2Anything App provides drivers with recommendations about FEV range autonomy, the location of battery charging stations, and electricity market information, as well as a route planner that takes into account public transportation and car- or bike-sharing systems. The main contributions of this application are related to the creation of an Information and Communication Technology (ICT) platform, recommender systems, data integration systems, driver profiles, and personalized range prediction. Thus, it is possible to deliver relevant information to FEV drivers related to the electric mobility process, the electricity market, public transportation, and FEV performance.

Relevance:

40.00%

Publisher:

Abstract:

Published in "Information control in manufacturing 1998 : (INCOM'98) : advances in industrial engineering : a proceedings volume from the 9th IFAC Symposium, Nancy-Metz, France, 24-26 June 1998. Vol. 2"

Relevance:

30.00%

Publisher:

Abstract:

The development of organic materials displaying high two-photon absorption (TPA) has attracted much attention in recent years due to a variety of potential applications in photonics and optoelectronics, such as three-dimensional optical data storage, fluorescence imaging, two-photon microscopy, optical limiting, microfabrication, photodynamic therapy, and upconverted lasing. The most frequently employed structural motifs for TPA materials are donor–π bridge–acceptor (D–π–A) dipoles, donor–π bridge–donor (D–π–D) and acceptor–π bridge–acceptor (A–π–A) quadrupoles, and octupoles. In this work we present the synthesis and photophysical characterization of quadrupolar heterocyclic systems with potential applications in materials and biological sciences as TPA chromophores. Indole is a versatile building block for the synthesis of heterocyclic systems for several optoelectronic applications (chemosensors, nonlinear optics, OLEDs) due to its photophysical properties and electron-donating ability, and the 4H-pyran-4-ylidene fragment is frequently used for the synthesis of red light-emitting materials. In addition, 2-(2,6-dimethyl-4H-pyran-4-ylidene)malononitrile (1) and 1,3-diethyl-dihydro-5-(2,6-dimethyl-4H-pyran-4-ylidene)-2-thiobarbituric acid (2) units are usually used as strong acceptor moieties for the preparation of π-conjugated systems of the push-pull type. These building blocks were prepared by Knoevenagel condensation of the corresponding ketone precursor with malononitrile or 1,3-diethyl-dihydro-2-thiobarbituric acid. The new quadrupolar 4H-pyran-4-ylidene fluorophores (3) derived from indole were prepared through condensation of 5-methyl-1H-indole-3-carbaldehyde with the acceptor precursors 1 and 2, in the presence of a catalytic amount of piperidine. The new compounds were characterized by the usual spectroscopic techniques (UV-vis, FT-IR and multinuclear NMR: 1H, 13C).

Relevance:

30.00%

Publisher:

Abstract:

Hospitals nowadays collect vast amounts of data related to patient records. These data hold valuable knowledge that can be used to improve hospital decision making. Data mining techniques aim precisely at the extraction of useful knowledge from raw data. This work describes an implementation of a medical data mining project based on the CRISP-DM methodology. Recent real-world data, from 2000 to 2013, related to inpatient hospitalization were collected from a Portuguese hospital. The goal was to predict generic hospital length of stay based on indicators that are commonly available during the hospitalization process (e.g., gender, age, episode type, medical specialty). At the data preparation stage, the data were cleaned and variables were selected and transformed, leading to 14 inputs. Next, at the modeling stage, a regression approach was adopted, in which six learning methods were compared: Average Prediction, Multiple Regression, Decision Tree, Artificial Neural Network ensemble, Support Vector Machine, and Random Forest. The best learning model was obtained by the Random Forest method, which achieved a high coefficient of determination (0.81). This model was then opened up using a sensitivity analysis procedure that revealed three influential input attributes: the hospital episode type, the physical service where the patient is hospitalized, and the associated medical specialty. Such extracted knowledge confirmed that the obtained predictive model is credible and potentially valuable for supporting the decisions of hospital managers.
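For readers unfamiliar with the metric quoted above, the sketch below computes the coefficient of determination in its standard form, R² = 1 − SS_res / SS_tot, on a few made-up values. It assumes this standard definition; the abstract does not state exactly how the reported 0.81 was computed, and the function name `r_squared` and the sample numbers are illustrative only.

```python
def r_squared(actual: list[float], predicted: list[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(actual) / len(actual)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((a - mean) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


# Made-up lengths of stay (in days), not data from the study.
actual_days = [2.0, 4.0, 6.0, 8.0]
model_days = [3.0, 4.0, 6.0, 7.0]
baseline_days = [5.0, 5.0, 5.0, 5.0]  # always predict the mean of actual_days

model_score = r_squared(actual_days, model_days)
baseline_score = r_squared(actual_days, baseline_days)
```

Under this definition, a predictor that always outputs the mean of the evaluated values scores 0, which is why an average-prediction method serves as a natural baseline for the other learners.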

Relevance:

30.00%

Publisher:

Abstract:

The acoustic emission (AE) technique is used to investigate interfacial fracture and damage propagation in GFRP- and SRG-strengthened bricks during debonding tests. Bond behavior is investigated through single-lap shear bond tests, and the fracture progress during the tests is recorded by means of AE sensors. The fracture progress and active debonding mechanisms are characterized in both specimen types with the aid of AE outputs. Moreover, a clear distinction is found between the AE outputs of specimens with different failure modes, in both SRG- and GFRP-strengthened specimens, which allows the debonding failure mode to be characterized from acoustic emission data.

Relevance:

30.00%

Publisher:

Abstract:

It is difficult to avoid the topic of “smart systems” when discussing smart prevention and, similarly, it is difficult to address smart systems without focusing on their ability to learn. Following the same line of thought, it currently seems a Herculean task (or an irreparable omission) to approach the topic of certified occupational health and safety management systems (OHSMS) without discussing integrated management systems (IMSs). The available data suggest that an OHSMS seldom operates as the single management system (MS) in a company, so any statement concerning OHSMS should mainly be interpreted from an integrated perspective. A major distinction between generic systems can be drawn between those that learn, i.e., those that have “memory”, and those that do not. The former are often depicted as adaptive, since they take past events into account to deal with novel, similar, and future events, modifying their structure to enable success in their environment. Often, these systems present nonlinear behavior and a huge uncertainty related to the forecasting of some events. This paper seeks to portray IMSs as complex adaptive systems (CASs), for the first time as far as we were able to determine, by listing their properties and dissecting the features that enable them to evolve and self-organize in order to holistically fulfil the requirements of different stakeholders and thus thrive by assuring the successful sustainability of a company. Based on the literature review carried out, this is the first time that IMSs are identified as CASs, which may develop fruitful synergies for both the MS and CAS communities. By performing a thorough literature review, and based on some concepts embedded in the “DNA” of the subsystems' implementation standards, the paper specifically intends to identify, determine, and discuss the properties of a generic IMS that should be considered in order to classify it as a CAS.