894 results for Data modelling


Relevance:

60.00%

Publisher:

Abstract:

Safety on public transport is a major concern for the relevant authorities. We address this issue by proposing an automated surveillance platform which combines data from video, infrared and pressure sensors. Data homogenisation and integration are achieved by a distributed architecture based on communication middleware that resolves interconnection issues, thereby enabling data modelling. A common-sense knowledge base models and encodes knowledge about public-transport platforms and the actions and activities of passengers. Trajectory data from passengers are modelled as a time series of human activities. Common-sense knowledge and rules are then applied to detect inconsistencies or errors in the data interpretation. Lastly, the rationality that characterises human behaviour is also captured here through a bottom-up Hierarchical Task Network planner that, along with common-sense knowledge, corrects misinterpretations to explain passenger behaviour. The system is validated using a simulated bus saloon scenario as a case study. Eighteen video sequences were recorded with up to six passengers. Four metrics were used to evaluate performance. The system, with an accuracy greater than 90% for each of the four metrics, was found to outperform a rule-based system and a system based on planning alone.
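As a rough illustration of modelling a passenger trajectory as a time series of activities and checking it against common-sense rules, a minimal Python sketch follows; the activity labels and the single ordering rule are invented for the example and are not the paper's knowledge base or HTN planner.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    t: float        # timestamp in seconds
    activity: str   # hypothetical labels, e.g. "boarding", "walking", "seated", "alighting"

def find_inconsistencies(trajectory: list[Observation]) -> list[str]:
    """Flag interpretations that violate a simple common-sense ordering rule:
    in-vehicle activities cannot occur before boarding or after alighting."""
    errors, boarded, alighted = [], False, False
    for obs in trajectory:
        if not boarded and obs.activity in ("walking", "seated"):
            errors.append(f"t={obs.t}: '{obs.activity}' before boarding")
        if alighted and obs.activity != "alighting":
            errors.append(f"t={obs.t}: '{obs.activity}' after alighting")
        boarded |= obs.activity == "boarding"
        alighted |= obs.activity == "alighting"
    return errors

trajectory = [Observation(0.0, "seated"), Observation(2.5, "boarding"),
              Observation(5.0, "walking"), Observation(9.0, "alighting")]
print(find_inconsistencies(trajectory))   # flags the 'seated' observation at t=0.0
```

In the paper, flagged inconsistencies are resolved with the common-sense knowledge base and the HTN planner rather than simply reported as above.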

Relevance:

40.00%

Publisher:

Abstract:

Matrix population models, elasticity analysis and loop analysis can potentially provide powerful techniques for the analysis of life histories. Data from a capture-recapture study on a population of southern highland water skinks (Eulamprus tympanum) were used to construct a matrix population model. Errors in elasticities were calculated by using the parametric bootstrap technique. Elasticity and loop analyses were then conducted to identify the life history stages most important to fitness. The same techniques were used to investigate the relative importance of fast versus slow growth, and rapid versus delayed reproduction. Mature water skinks were long-lived, but there was high immature mortality. The most sensitive life history stage was the subadult stage. It is suggested that life history evolution in E. tympanum may be strongly affected by predation, particularly by birds. Because our population declined over the study, slow growth and delayed reproduction were the optimal life history strategies over this period. Although the techniques of evolutionary demography provide a powerful approach for the analysis of life histories, there are formidable logistical obstacles in gathering enough high-quality data for robust estimates of the critical parameters.
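As a rough illustration of the elasticity analysis mentioned above, the sketch below computes elasticities from a hypothetical stage-structured projection matrix; the matrix entries are invented for illustration and are not the E. tympanum estimates.

```python
import numpy as np

# Hypothetical 3-stage projection matrix (juvenile, subadult, adult)
A = np.array([[0.00, 0.40, 1.30],   # fecundities into the juvenile stage
              [0.25, 0.30, 0.00],   # survival/growth into the subadult stage
              [0.00, 0.45, 0.80]])  # survival into/within the adult stage

# Dominant eigenvalue (population growth rate) and right eigenvector w
evals, rvecs = np.linalg.eig(A)
k = np.argmax(evals.real)
lam = evals.real[k]
w = np.abs(rvecs[:, k].real)         # stable stage distribution

# Left eigenvector v (reproductive values) from the transpose
evalsT, lvecs = np.linalg.eig(A.T)
kT = np.argmax(evalsT.real)
v = np.abs(lvecs[:, kT].real)

# Sensitivities s_ij = v_i * w_j / <v, w>; elasticities e_ij = (a_ij / lambda) * s_ij
S = np.outer(v, w) / (v @ w)
E = (A / lam) * S
print("lambda =", round(lam, 3))
print("elasticities (sum to 1):\n", np.round(E, 3))
```

Elasticities sum to one, so the largest entries indicate the transitions (and, by summing around loops, the life-history pathways) that contribute most to the growth rate; a parametric bootstrap repeats this calculation on matrices resampled from the estimated vital-rate distributions to attach errors to the elasticities.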

Relevance:

40.00%

Publisher:

Abstract:

Examples from the Murray-Darling basin in Australia are used to illustrate different methods of disaggregation of reconnaissance-scale maps. One approach to disaggregation revolves around the de-convolution of the soil-landscape paradigm elaborated during a soil survey. The descriptions of soil map units and block diagrams in a soil survey report detail soil-landscape relationships, or soil toposequences, that can be used to disaggregate map units into component landscape elements. Toposequences can be visualised on a computer by combining soil maps with digital elevation data. Expert knowledge or statistics can be used to implement the disaggregation. Use of a restructuring element and k-means clustering is illustrated. Another approach to disaggregation uses training areas to develop rules to extrapolate detailed mapping into other, larger areas where detailed mapping is unavailable. A two-level decision tree example is presented. At one level, the decision tree method is used to capture mapping rules from the training area; at another level, it is used to define the domain over which those rules can be extrapolated. (C) 2001 Elsevier Science B.V. All rights reserved.
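A minimal sketch of the k-means route to disaggregation, assuming a map unit has been intersected with terrain attributes derived from a DEM; the attribute names, synthetic values and choice of three clusters are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the grid cells falling inside one reconnaissance-scale map unit
rng = np.random.default_rng(0)
n_cells = 500
terrain = np.column_stack([
    rng.normal(250, 40, n_cells),   # elevation (m)
    rng.normal(5, 2, n_cells),      # slope (degrees)
    rng.normal(0, 1, n_cells),      # topographic wetness index (standardised)
])

# Standardise attributes so no single variable dominates the distance metric
z = (terrain - terrain.mean(axis=0)) / terrain.std(axis=0)

# k-means assigns each grid cell within the map unit to a candidate landscape element
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(z)
print(np.bincount(labels))          # cells per candidate landscape element
```

The clusters would then be matched against the toposequence described in the survey report, with expert knowledge deciding which cluster corresponds to which landscape element.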

Relevance:

40.00%

Publisher:

Abstract:

The principle of using induction rules based on spatial environmental data to model a soil map has previously been demonstrated. Whilst the general pattern of classes of large spatial extent and those with close association with geology were delineated, small classes and the detailed spatial pattern of the map were less well rendered. Here we examine several strategies to improve the quality of the soil map models generated by rule induction. Terrain attributes that are better suited to landscape description at a resolution of 250 m are introduced as predictors of soil type. A map sampling strategy is developed. Classification error is reduced by using boosting rather than cross-validation to improve the model. Further, the benefit of incorporating the local spatial context for each environmental variable into the rule induction is examined. The best model was achieved by sampling in proportion to the spatial extent of the mapped classes, boosting the decision trees and using spatial contextual information extracted from the environmental variables.
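A minimal sketch of the boosting-plus-spatial-context idea, using AdaBoost over decision trees as a stand-in boosting algorithm (the paper's exact learner is not specified here); the synthetic features, the moving-average "context" variable and all parameter choices are assumptions.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Synthetic terrain attributes plus a crude local spatial context feature
rng = np.random.default_rng(1)
n = 1000
elevation = rng.normal(300, 50, n)
slope = rng.gamma(2.0, 2.0, n)
context = np.convolve(elevation, np.ones(5) / 5, mode="same")   # neighbourhood mean of elevation
X = np.column_stack([elevation, slope, context])
y = (elevation + 10 * slope + rng.normal(0, 20, n) > 330).astype(int)   # two soil classes

# Boosted ensemble of shallow decision trees
model = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)
print("training accuracy:", round(model.score(X, y), 3))
```

The paper's sampling-in-proportion-to-class-extent strategy and the extraction of spatial context from every environmental variable would slot in at the data-preparation stage above, before the boosted trees are induced.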

Relevance:

40.00%

Publisher:

Abstract:

The present paper addresses two major concerns that were identified when developing neural network based prediction models and which can limit their wider applicability in the industry. The first problem is that neural network models do not appear to be readily available to a corrosion engineer. Therefore, the first part of this paper describes a neural network model of CO2 corrosion which was created using a standard commercial software package and simple modelling strategies. It was found that such a model was able to capture practically all of the trends noticed in the experimental data with acceptable accuracy. This exercise has proven that a corrosion engineer could readily develop a neural network model such as the one described below for any problem at hand, given that sufficient experimental data exist. This applies even in cases when the understanding of the underlying processes is poor. The second problem arises in cases when not all the required inputs for a model are known, or can be estimated only with a limited degree of accuracy. It seems advantageous to have models that can take as input a range rather than a single value. One such model, based on the so-called Monte Carlo approach, is presented. A number of comparisons are shown which illustrate how a corrosion engineer might use this approach to rapidly test the sensitivity of a model to the uncertainties associated with the input parameters. (C) 2001 Elsevier Science Ltd. All rights reserved.
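A minimal sketch of the Monte Carlo idea described above: when inputs are only known as ranges, sample them repeatedly and propagate the samples through the predictive model to obtain a distribution of outputs. The placeholder response function and input ranges below are invented and are not the paper's neural network model.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 10_000

# Inputs known only as ranges: sample uniformly between the stated bounds
temperature = rng.uniform(40.0, 60.0, n_samples)    # degrees C
pH = rng.uniform(4.5, 5.5, n_samples)
co2_pressure = rng.uniform(0.5, 2.0, n_samples)     # bar

def predicted_corrosion_rate(T, pH, pCO2):
    # Placeholder response surface standing in for a trained model
    return 0.05 * T + 2.0 * pCO2 - 1.5 * (pH - 4.0)

rates = predicted_corrosion_rate(temperature, pH, co2_pressure)
print(f"mean = {rates.mean():.2f} mm/yr, "
      f"5th-95th percentile = {np.percentile(rates, [5, 95]).round(2)}")
```

Narrowing or widening one input range and re-running the sampling gives the quick sensitivity test the abstract describes.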

Relevance:

40.00%

Publisher:

Abstract:

We focus on mixtures of factor analyzers from the perspective of a method for model-based density estimation from high-dimensional data, and hence for the clustering of such data. This approach enables a normal mixture model to be fitted to a sample of n data points of dimension p, where p is large relative to n. The number of free parameters is controlled through the dimension of the latent factor space. Working in this reduced space allows a model for each component-covariance matrix with complexity lying between that of the isotropic and full covariance structure models. We illustrate the use of mixtures of factor analyzers in a practical example that considers the clustering of cell lines on the basis of gene expressions from microarray experiments. (C) 2002 Elsevier Science B.V. All rights reserved.
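A quick way to see the complexity argument is to count the free parameters in each component-covariance structure; the dimensions below (p observed variables, q latent factors) are arbitrary values chosen only to show the scaling, not the paper's data.

```python
# Covariance model for one mixture component: Sigma = Lambda @ Lambda.T + Psi,
# with Lambda a (p x q) loading matrix and Psi a diagonal matrix.
# Free-parameter counts (ignoring the rotational redundancy in Lambda):
p, q = 1000, 5   # e.g. 1000 genes, 5 latent factors - arbitrary example values
print("isotropic covariance:       ", 1)
print("factor-analyzer covariance: ", p * q + p)
print("full covariance:            ", p * (p + 1) // 2)
```

With p = 1000 and q = 5 the factor-analyzer structure needs roughly 6,000 parameters per component against about 500,000 for a full covariance matrix, which is what makes the mixture fit feasible when p is large relative to n.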

Relevance:

40.00%

Publisher:

Abstract:

We compare Bayesian methodology utilizing the freeware package BUGS (Bayesian Inference Using Gibbs Sampling) with the traditional structural equation modelling approach based on another freeware package, Mx. Dichotomous and ordinal (three-category) twin data were simulated according to different additive genetic and common environment models for phenotypic variation. Practical issues are discussed in using Gibbs sampling, as implemented by BUGS, to fit subject-specific Bayesian generalized linear models, where the components of variation may be estimated directly. The simulation study (based on 2000 twin pairs) indicated that there is a consistent advantage in using the Bayesian method to detect a correct model under certain specifications of additive genetic and common environmental effects. For binary data, both methods had difficulty in detecting the correct model when the additive genetic effect was low (between 10 and 20%) or of moderate range (between 20 and 40%). Furthermore, neither method could adequately detect a correct model that included a modest common environmental effect (20%) even when the additive genetic effect was large (50%). Power was significantly improved with ordinal data for most scenarios, except for the case of low heritability under a true ACE model. We illustrate and compare both methods using data from 1239 twin pairs over the age of 50 years who were registered with the Australian National Health and Medical Research Council Twin Registry (ATR) and presented with symptoms associated with osteoarthritis occurring in joints of the hand.
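The abstract compares BUGS-based Bayesian fits with structural equation models in Mx; as a much simpler point of reference (not either of those methods), the classical Falconer decomposition shows how MZ and DZ twin correlations separate additive genetic (A), common environment (C) and unique environment (E) variance. The correlations below are invented for illustration.

```python
# Classical ACE decomposition from twin correlations (Falconer's formulas),
# using r_MZ = a2 + c2 and r_DZ = a2/2 + c2.
r_mz = 0.60   # monozygotic twin correlation (illustrative value)
r_dz = 0.40   # dizygotic twin correlation (illustrative value)

a2 = 2 * (r_mz - r_dz)   # additive genetic share of phenotypic variance
c2 = 2 * r_dz - r_mz     # common (shared) environment share
e2 = 1 - r_mz            # unique environment share (plus measurement error)
print(f"A = {a2:.2f}, C = {c2:.2f}, E = {e2:.2f}")
```

The Bayesian generalized linear models in the paper estimate these variance components directly from the binary or ordinal responses rather than from observed correlations.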

Relevance:

40.00%

Publisher:

Abstract:

25th Annual Conference of the European Cetacean Society, Cadiz, Spain, 21-23 March 2011.

Relevance:

40.00%

Publisher:

Abstract:

Research on the problem of feature selection for clustering continues to develop. This is a challenging task, mainly due to the absence of class labels to guide the search for relevant features. Categorical feature selection for clustering has rarely been addressed in the literature, with most of the proposed approaches having focused on numerical data. In this work, we propose an approach to simultaneously cluster categorical data and select a subset of relevant features. Our approach is based on a modification of a finite mixture model (of multinomial distributions), where a set of latent variables indicates the relevance of each feature. To estimate the model parameters, we implement a variant of the expectation-maximization algorithm that simultaneously selects the subset of relevant features, using a minimum message length criterion. The proposed approach compares favourably with two baseline methods: a filter based on an entropy measure and a wrapper based on mutual information. The results obtained on synthetic data illustrate the ability of the proposed expectation-maximization method to recover ground truth. An application to real data from official statistics shows its usefulness.
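The sketch below is a bare-bones EM fit of a mixture of multinomials to a single one-hot-encoded categorical feature; it omits the paper's feature-saliency latent variables and the minimum message length criterion, and all names and data are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def em_multinomial_mixture(X, n_components, n_iter=100, alpha=1e-3):
    """X: (n, d) matrix of counts (here, one-hot encodings of one categorical feature)."""
    n, d = X.shape
    pi = np.full(n_components, 1.0 / n_components)          # mixing weights
    theta = rng.dirichlet(np.ones(d), size=n_components)    # category probabilities per component
    for _ in range(n_iter):
        # E-step: responsibilities proportional to pi_k * prod_j theta_kj^x_ij
        log_r = np.log(pi) + X @ np.log(theta).T             # (n, K)
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update mixing weights and (smoothed) category probabilities
        pi = r.mean(axis=0)
        counts = r.T @ X + alpha
        theta = counts / counts.sum(axis=1, keepdims=True)
    return pi, theta, r

# Toy data: two clusters over one 4-category feature, one-hot encoded
labels = rng.integers(0, 2, 300)
cats = np.where(labels == 0,
                rng.choice(4, 300, p=[0.7, 0.1, 0.1, 0.1]),
                rng.choice(4, 300, p=[0.1, 0.1, 0.1, 0.7]))
X = np.eye(4)[cats]
pi, theta, r = em_multinomial_mixture(X, n_components=2)
print(np.round(pi, 2))
print(np.round(theta, 2))
```

In the paper's variant, an additional saliency variable per feature is updated inside the same EM loop, and the minimum message length criterion penalises models that keep irrelevant features or too many components.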

Relevance:

40.00%

Publisher:

Abstract:

In this study, we concentrate on modelling gross primary productivity using two simple approaches to simulate canopy photosynthesis: "big leaf" and "sun/shade" models. Two approaches to calibration are used: scaling up canopy photosynthetic parameters from the leaf to the canopy level, and fitting canopy biochemistry to eddy covariance fluxes. The models are validated against eddy covariance data from the LBA site C14. Comparing the performance of both models, we conclude that, numerically (in terms of goodness of fit) and qualitatively (in terms of residual response to different environmental variables), the sun/shade model does a better job. Compared to the sun/shade model, the big leaf model shows a lower goodness of fit, fails to respond to variations in the diffuse fraction, and has skewed responses to temperature and VPD. The separate treatment of sun and shade leaves, in combination with the separation of the incoming light into direct beam and diffuse components, makes sun/shade a strong modelling tool that captures more of the observed variability in canopy fluxes as measured by eddy covariance. In conclusion, the sun/shade approach is a relatively simple and effective tool for modelling photosynthetic carbon uptake that could easily be included in many terrestrial carbon models.
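A minimal sketch of the geometric step that distinguishes a two-leaf ("sun/shade") model from a big-leaf one: partitioning leaf area into sunlit and shaded fractions with Beer's law. The LAI, zenith angle and spherical-leaf-angle extinction coefficient below are assumptions for the example, not the calibrated model from the study.

```python
import numpy as np

LAI = 5.0                          # total leaf area index (m2 leaf per m2 ground)
solar_zenith = np.radians(30)      # solar zenith angle
k_b = 0.5 / np.cos(solar_zenith)   # beam extinction coefficient, spherical leaf angle distribution

lai_sunlit = (1.0 - np.exp(-k_b * LAI)) / k_b   # sunlit leaf area (de Pury & Farquhar-style two-leaf scheme)
lai_shaded = LAI - lai_sunlit
print(f"sunlit LAI = {lai_sunlit:.2f}, shaded LAI = {lai_shaded:.2f}")
```

Sunlit leaves then receive both direct-beam and diffuse radiation while shaded leaves receive only diffuse, which is why the two-leaf treatment can respond to changes in the diffuse fraction that a big-leaf model averages away.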

Relevance:

40.00%

Publisher:

Abstract:

Doctoral Programme in Mathematics and Applications.

Relevance:

40.00%

Publisher:

Abstract:

This paper describes the concept, technical realisation and validation of a largely data-driven method to model events with Z→ττ decays. In Z→μμ events selected from proton-proton collision data recorded at √s = 8 TeV with the ATLAS experiment at the LHC in 2012, the Z decay muons are replaced by τ leptons from simulated Z→ττ decays at the level of reconstructed tracks and calorimeter cells. The τ lepton kinematics are derived from the kinematics of the original muons. Thus, only the well-understood decays of the Z boson and τ leptons as well as the detector response to the τ decay products are obtained from simulation. All other aspects of the event, such as the Z boson and jet kinematics as well as effects from multiple interactions, are given by the actual data. This so-called τ-embedding method is particularly relevant for Higgs boson searches and analyses in ττ final states, where Z→ττ decays constitute a large irreducible background that cannot be obtained directly from data control samples.
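As a purely kinematic illustration of the replacement step (not the ATLAS embedding software, which operates on reconstructed tracks and calorimeter cells), the toy below builds τ four-momenta that inherit the direction and momentum of the selected muons but carry the τ mass; the muon kinematics are invented example values.

```python
import math

TAU_MASS = 1.77686   # GeV (PDG value)

def embed_tau(pt, eta, phi):
    """Toy replacement: keep the muon's (pt, eta, phi), swap in the tau mass,
    and return the tau four-momentum (px, py, pz, E). Illustration only."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px**2 + py**2 + pz**2 + TAU_MASS**2)
    return px, py, pz, energy

# Two hypothetical muons from a selected Z->mumu event (pt in GeV)
for pt, eta, phi in [(45.0, 0.3, 1.2), (38.0, -0.8, -1.9)]:
    print(embed_tau(pt, eta, phi))
```

In the actual method, the simulated τ decays and the detector response to their decay products are then merged back into the remainder of the recorded event.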

Relevance:

40.00%

Publisher:

Abstract:

Doctoral Thesis in Sciences (Specialisation in Mathematics)