963 results for integrating data
Abstract:
Today, the data available to tackle many scientific challenges is vast in quantity and diverse in nature. The exploration of heterogeneous information spaces requires suitable mining algorithms as well as effective visual interfaces. Most existing systems concentrate either on mining algorithms or on visualization techniques. Though visual methods developed in information visualization have been helpful, improved understanding of a complex, large, high-dimensional dataset requires an effective projection of the dataset onto a lower-dimensional (2D or 3D) manifold. This paper introduces a flexible visual data mining framework which combines advanced projection algorithms developed in the machine learning domain and visual techniques developed in the information visualization domain. The framework follows Shneiderman’s mantra to provide an effective user interface. The advantage of such an interface is that the user is directly involved in the data mining process. We integrate principled projection methods, such as Generative Topographic Mapping (GTM) and Hierarchical GTM (HGTM), with powerful visual techniques, such as magnification factors, directional curvatures, parallel coordinates, billboarding, and user interaction facilities, to provide an integrated visual data mining framework. Results on a real-life high-dimensional dataset from the chemoinformatics domain are also reported and discussed. Projection results of GTM are analytically compared with the projection results from other traditional projection methods, and it is also shown that the HGTM algorithm provides additional value for large datasets. The computational complexity of these algorithms is discussed to demonstrate their suitability for the visual data mining framework.
Abstract:
Data integration systems offer uniform access to a set of autonomous and heterogeneous data sources. One of the main challenges in data integration is reconciling semantic differences among data sources. Approaches that have been used to solve this problem can be categorized as schema-based and attribute-based. Schema-based approaches use schema information to identify the semantic similarity in data; furthermore, they focus on reconciling types before reconciling attributes. In contrast, attribute-based approaches use statistical and structural information of attributes to identify the semantic similarity of data in different sources. This research examines an approach to semantic reconciliation based on integrating properties expressed at different levels of abstraction or granularity using the concept of property precedence. Property precedence reconciles the meaning of attributes by identifying similarities between attributes based on what these attributes represent in the real world. In order to use property precedence for semantic integration, we need to identify the precedence of attributes within and across data sources. The goal of this research is to develop and evaluate a method and algorithms that will identify precedence relations among attributes and build a property precedence graph (PPG) that can be used to support integration.
Abstract:
Advances in communication, navigation and imaging technologies are expected to fundamentally change the methods currently used to collect data. Electronic data interchange strategies will also minimize data handling and automatically update files at the point of capture. This report summarizes the outcome of using a multi-camera platform as a method to collect roadway inventory data. It defines basic system requirements as expressed by users who applied these techniques, and examines how the application of the technology met those needs. A sign inventory case study was used to determine the advantages of creating and maintaining the database, which provides the capability to monitor performance criteria for a Safety Management System. The project found that at least 75 percent of the data elements needed for a sign inventory can be gathered by viewing a high-resolution image.
Abstract:
Thesis (Master's)--University of Washington, 2016-06
Abstract:
This paper presents a low-cost scaled model of a silo for drying and airing cereal grains. It allows several parameters associated with the silo's operation to be controlled and monitored through a remotely accessible infrastructure. The scaled model consists of a 2.50 m wide × 2.10 m long plant with all control and monitoring capabilities provided by micro-Web servers. An application running on the micro-Web servers stores all parameters in a database for later analysis. The implemented model aims to support a remote experimentation facility for technological education, research-oriented tutorials, and industrial applications. Given the low-cost requirement, this remote facility can be easily replicated in other institutions to support a network of remote labs, which encompasses concurrent access by several users (e.g. students).
Abstract:
Dissertation presented to obtain the degree of Doctor in Biochemistry, Structural Biochemistry, from the Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia
Abstract:
The latest medical diagnosis devices enable e-diagnosis, making access to these services easier and faster and making them available in remote areas. However, this imposes new communication and data interchange challenges. In this paper, a new XML-based format for storing cardiac signals and related information is presented. The proposed structure encompasses data acquisition devices, patient information, data description, pathological diagnosis and waveform annotation. It offers several advantages over formats of similar purpose. Besides the fully integrated data model, these include geographical references for e-diagnosis, multi-stream data description, the ability to handle several simultaneous devices, the possibility of independent waveform annotation and an HL7-compliant structure for common contents. These features provide enhanced integration with existing systems and improved flexibility for cardiac data representation.
Abstract:
The importance of wind power for energy and environmental policies has been growing in recent years. However, because of its random nature over time, wind generation cannot be reliably dispatched or perfectly forecasted, which makes integrating this production into power systems a challenge. In addition, wind energy has to cope with the diversity of production resulting from alternative wind power profiles located in different regions. In 2012, Portugal presented a cumulative installed capacity distributed over 223 wind farms [1]. In this work, circular statistical methods are used to analyze and compare alternative spatial wind generation profiles. Variables indicating extreme situations are analyzed. The hour(s) of the day at which a farm attains its maximum daily production is considered. This variable was converted into a circular variable, and the use of circular statistics makes it possible to identify the daily hour distribution for different wind production profiles. This methodology was applied to a real case, considering data from the Portuguese power system for the year 2012 at 15-minute intervals. Six geographical locations were considered, representing different wind generation profiles in the Portuguese system.
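To illustrate why the hour of peak production is treated as a circular variable, the following is a minimal Python sketch of a circular mean. It is not the authors' code: the function names, the 24-hour period parameter and the sample peak hours are all hypothetical, chosen only to show how hours are mapped to angles and averaged on the circle.

```python
import math

def hours_to_angles(hours, period=24.0):
    """Map hours of the day onto angles in radians on the unit circle."""
    return [2.0 * math.pi * (h % period) / period for h in hours]

def circular_mean_hour(hours, period=24.0):
    """Return (mean hour, mean resultant length) of a sample of hours.

    The mean resultant length lies in [0, 1]; values close to 1 mean the
    hours are tightly concentrated around the mean direction.
    """
    angles = hours_to_angles(hours, period)
    s = sum(math.sin(a) for a in angles) / len(angles)
    c = sum(math.cos(a) for a in angles) / len(angles)
    mean_angle = math.atan2(s, c) % (2.0 * math.pi)
    resultant_length = math.hypot(s, c)
    return mean_angle * period / (2.0 * math.pi), resultant_length

# Hypothetical daily peak-production hours clustered around midnight
# (illustrative values only, not data from the study).
peak_hours = [23.0, 23.5, 0.5, 1.0]
mean_hour, r = circular_mean_hour(peak_hours)
print(f"circular mean hour: {mean_hour:.3f}, resultant length: {r:.3f}")
```

For these sample values the ordinary arithmetic mean is 12.0 (noon), far from every observation, whereas the circular mean falls at the midnight wrap-around (numerically just above 0 or just below 24), which is the behaviour that motivates circular statistics for time-of-day data.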
Abstract:
Dissertation presented at the Faculdade de Ciências e Tecnologia of the Universidade Nova de Lisboa to obtain the degree of Master in Renewable Energies – Electrical Conversion and Sustainable Use
Abstract:
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.