894 resultados para Data modelling
Resumo:
This paper investigates the use of ensemble of predictors in order to improve the performance of spatial prediction methods. Support vector regression (SVR), a popular method from the field of statistical machine learning, is used. Several instances of SVR are combined using different data sampling schemes (bagging and boosting). Bagging shows good performance, and proves to be more computationally efficient than training a single SVR model while reducing error. Boosting, however, does not improve results on this specific problem.
Resumo:
A unified approach is proposed for data modelling that includes supervised regression and classification applications as well as unsupervised probability density function estimation. The orthogonal-least-squares regression based on the leave-one-out test criteria is formulated within this unified data-modelling framework to construct sparse kernel models that generalise well. Examples from regression, classification and density estimation applications are used to illustrate the effectiveness of this generic data-modelling approach for constructing parsimonious kernel models with excellent generalisation capability. (C) 2008 Elsevier B.V. All rights reserved.
Resumo:
Subsidence is a natural hazard that affects wide areas in the world causing important economic costs annually. This phenomenon has occurred in the metropolitan area of Murcia City (SE Spain) as a result of groundwater overexploitation. In this work aquifer system subsidence is investigated using an advanced differential SAR interferometry remote sensing technique (A-DInSAR) called Stable Point Network (SPN). The SPN derived displacement results, mainly the velocity displacement maps and the time series of the displacement, reveal that in the period 2004–2008 the rate of subsidence in Murcia metropolitan area doubled with respect to the previous period from 1995 to 2005. The acceleration of the deformation phenomenon is explained by the drought period started in 2006. The comparison of the temporal evolution of the displacements measured with the extensometers and the SPN technique shows an average absolute error of 3.9±3.8 mm. Finally, results from a finite element model developed to simulate the recorded time history subsidence from known water table height changes compares well with the SPN displacement time series estimations. This result demonstrates the potential of A-DInSAR techniques to validate subsidence prediction models as an alternative to using instrumental ground based techniques for validation.
Resumo:
Gli obiettivi di questi tesi sono lo studio comparativo di alcuni DBMS non relazionali e il confronto di diverse soluzioni di modellazione logica e fisica per database non relazionali. Utilizzando come sistemi di gestione due DBMS Document-based non relazionali, MongoDB e CouchDB, ed un DBMS relazionale, Oracle, sarà effettuata un’analisi di diverse soluzione di modellazione logica dei dati in database documentali e uno studio mirato alla scelta degli attributi sui quali costruire indici. In primo luogo verrà definito un semplice caso di studio su cui effettuare i confronto, basato su due entità in relazione 1:N, sulle quali sarà costruito un opportuno carico di lavoro. Idatabase non relazionali sono schema-less, senza schema fisso, ed esiste una libertà maggiore di modellazione. In questo lavoro di tesi i dati verranno modellati secondo le tecniche del Referencing ed Embedding che consistono rispettivamente nell’inserimento di una chiave (riferimento) oppure di un intero sotto-documento (embedding) all’interno di un documento per poter esprimere il concetto di relazione tra diverse entità. Per studiare l’opportunità di indicizzare un attributo, ciascuna entità sarà poi composta da due triplette uguali di attributi definiti con differenti livelli di selettività, con la differenza che su ciascun attributo della seconda sarà costruito un indice. Il carico di lavoro sarà costituito da query definite in modo da poter testare le diverse modellazioni includendo anche predicati di join che non sono solitamente contemplati in modelli documentali. Per ogni tipo di database verranno eseguite le query e registrati i tempi, in modo da poter confrontare le performance dei diversi DBMS sulla base delle operazioni CRUD.
Resumo:
The research aimed to establish tyre-road noise models by using a Data Mining approach that allowed to build a predictive model and assess the importance of the tested input variables. The data modelling took into account three learning algorithms and three metrics to define the best predictive model. The variables tested included basic properties of pavement surfaces, macrotexture, megatexture, and uneven- ness and, for the first time, damping. Also, the importance of those variables was measured by using a sensitivity analysis procedure. Two types of models were set: one with basic variables and another with complex variables, such as megatexture and damping, all as a function of vehicles speed. More detailed models were additionally set by the speed level. As a result, several models with very good tyre-road noise predictive capacity were achieved. The most relevant variables were Speed, Temperature, Aggregate size, Mean Profile Depth, and Damping, which had the highest importance, even though influenced by speed. Megatexture and IRI had the lowest importance. The applicability of the models developed in this work is relevant for trucks tyre-noise prediction, represented by the AVON V4 test tyre, at the early stage of road pavements use. Therefore, the obtained models are highly useful for the design of pavements and for noise prediction by road authorities and contractors.
Resumo:
A complete workflow specification requires careful integration of many different process characteristics. Decisions must be made as to the definitions of individual activities, their scope, the order of execution that maintains the overall business process logic, the rules governing the discipline of work list scheduling to performers, identification of time constraints and more. The goal of this paper is to address an important issue in workflows modelling and specification, which is data flow, its modelling, specification and validation. Researchers have neglected this dimension of process analysis for some time, mainly focussing on structural considerations with limited verification checks. In this paper, we identify and justify the importance of data modelling in overall workflows specification and verification. We illustrate and define several potential data flow problems that, if not detected prior to workflow deployment may prevent the process from correct execution, execute process on inconsistent data or even lead to process suspension. A discussion on essential requirements of the workflow data model in order to support data validation is also given..
Resumo:
The algorithmic approach to data modelling has developed rapidly these last years, in particular methods based on data mining and machine learning have been used in a growing number of applications. These methods follow a data-driven methodology, aiming at providing the best possible generalization and predictive abilities instead of concentrating on the properties of the data model. One of the most successful groups of such methods is known as Support Vector algorithms. Following the fruitful developments in applying Support Vector algorithms to spatial data, this paper introduces a new extension of the traditional support vector regression (SVR) algorithm. This extension allows for the simultaneous modelling of environmental data at several spatial scales. The joint influence of environmental processes presenting different patterns at different scales is here learned automatically from data, providing the optimum mixture of short and large-scale models. The method is adaptive to the spatial scale of the data. With this advantage, it can provide efficient means to model local anomalies that may typically arise in situations at an early phase of an environmental emergency. However, the proposed approach still requires some prior knowledge on the possible existence of such short-scale patterns. This is a possible limitation of the method for its implementation in early warning systems. The purpose of this paper is to present the multi-scale SVR model and to illustrate its use with an application to the mapping of Cs137 activity given the measurements taken in the region of Briansk following the Chernobyl accident.
Resumo:
A unified approach is proposed for sparse kernel data modelling that includes regression and classification as well as probability density function estimation. The orthogonal-least-squares forward selection method based on the leave-one-out test criteria is presented within this unified data-modelling framework to construct sparse kernel models that generalise well. Examples from regression, classification and density estimation applications are used to illustrate the effectiveness of this generic sparse kernel data modelling approach.
Resumo:
A basic principle in data modelling is to incorporate available a priori information regarding the underlying data generating mechanism into the modelling process. We adopt this principle and consider grey-box radial basis function (RBF) modelling capable of incorporating prior knowledge. Specifically, we show how to explicitly incorporate the two types of prior knowledge: the underlying data generating mechanism exhibits known symmetric property and the underlying process obeys a set of given boundary value constraints. The class of orthogonal least squares regression algorithms can readily be applied to construct parsimonious grey-box RBF models with enhanced generalisation capability.
Resumo:
In this paper,the Prony's method is applied to the time-domain waveform data modelling in the presence of noise.The following three problems encountered in this work are studied:(1)determination of the order of waveform;(2)de-termination of numbers of multiple roots;(3)determination of the residues.The methods of solving these problems are given and simulated on the computer.Finally,an output pulse of model PG-10N signal generator and the distorted waveform obtained by transmitting the pulse above mentioned through a piece of coaxial cable are modelled,and satisfactory results are obtained.So the effectiveness of Prony's method in waveform data modelling in the presence of noise is confirmed.
Resumo:
A fundamental principle in data modelling is to incorporate available a priori information regarding the underlying data generating mechanism into the modelling process. We adopt this principle and consider grey-box radial basis function (RBF) modelling capable of incorporating prior knowledge. Specifically, we show how to explicitly incorporate the two types of prior knowledge: (i) the underlying data generating mechanism exhibits known symmetric property, and (ii) the underlying process obeys a set of given boundary value constraints. The class of efficient orthogonal least squares regression algorithms can readily be applied without any modification to construct parsimonious grey-box RBF models with enhanced generalisation capability.
Resumo:
Simulation modelling has been used for many years in the manufacturing sector but has now become a mainstream tool in business situations. This is partly because of the popularity of business process re-engineering (BPR) and other process based improvement methods that use simulation to help analyse changes in process design. This textbook includes case studies in both manufacturing and service situations to demonstrate the usefulness of the approach. A further reason for the increasing popularity of the technique is the development of business orientated and user-friendly Windows-based software. This text provides a guide to the use of ARENA, SIMUL8 and WITNESS simulation software systems that are widely used in industry and available to students. Overall this text provides a practical guide to building and implementing the results from a simulation model. All the steps in a typical simulation study are covered including data collection, input data modelling and experimentation.