618 results for Scaled Models.
Abstract:
The research objectives of this thesis were to contribute to Bayesian statistical methodology, in particular to risk assessment methodology and to spatial and spatio-temporal methodology, by modelling error structures using complex hierarchical models. Specifically, I hoped to consider two applied areas, and use these applications as a springboard for developing new statistical methods as well as undertaking analyses which might give answers to particular applied questions. Thus, this thesis considers a series of models, firstly in the context of risk assessments for recycled water, and secondly in the context of water usage by crops. The research objective was to model error structures using hierarchical models in two problems, namely risk assessment analyses for wastewater, and secondly, in a four-dimensional dataset, assessing differences between cropping systems over time and over three spatial dimensions. The aim was to use the simplicity and insight afforded by Bayesian networks to develop appropriate models for risk scenarios, and again to use Bayesian hierarchical models to explore the necessarily complex modelling of four-dimensional agricultural data. The specific objectives of the research were to develop a method for the calculation of credible intervals for the point estimates of Bayesian networks; to develop a model structure to incorporate all the experimental uncertainty associated with various constants thereby allowing the calculation of more credible credible intervals for a risk assessment; to model a single day’s data from the agricultural dataset which satisfactorily captured the complexities of the data; to build a model for several days’ data, in order to consider how the full data might be modelled; and finally to build a model for the full four-dimensional dataset and to consider the time-varying nature of the contrast of interest, having satisfactorily accounted for possible spatial and temporal autocorrelations.
This work forms five papers, two of which have been published, with two submitted, and the final paper still in draft. The first two objectives were met by recasting the risk assessments as directed acyclic graphs (DAGs). In the first case, we elicited uncertainty for the conditional probabilities needed by the Bayesian net, incorporated these into a corresponding DAG, and used Markov chain Monte Carlo (MCMC) to find credible intervals for all the scenarios and outcomes of interest. In the second case, we incorporated the experimental data underlying the risk assessment constants into the DAG, and also treated some of that data as needing to be modelled as an ‘errors-in-variables’ problem [Fuller, 1987]. This illustrated a simple method for the incorporation of experimental error into risk assessments. In considering one day of the three-dimensional agricultural data, it became clear that geostatistical models or conditional autoregressive (CAR) models over the three dimensions were not the best way to approach the data. Instead, CAR models are used with neighbours only in the same depth layer. This gave flexibility to the model, allowing both the spatially structured and non-structured variances to differ at all depths. We call this model the CAR layered model. Given the experimental design, the fixed part of the model could have been modelled as a set of means by treatment and by depth, but doing so allows little insight into how the treatment effects vary with depth. Hence, a number of essentially non-parametric approaches were taken to see the effects of depth on treatment, with the model of choice incorporating an errors-in-variables approach for depth in addition to a non-parametric smooth. The statistical contribution here was the introduction of the CAR layered model, the applied contribution the analysis of moisture over depth and estimation of the contrast of interest together with its credible intervals.
These models were fitted using WinBUGS [Lunn et al., 2000]. The work in the fifth paper deals with the fact that with large datasets, the use of WinBUGS becomes more problematic because of its highly correlated term-by-term updating. In this work, we introduce a Gibbs sampler with block updating for the CAR layered model. The Gibbs sampler was implemented by Chris Strickland using pyMCMC [Strickland, 2010]. This framework is then used to consider five days’ data, and we show that moisture in the soil for all the various treatments reaches levels particular to each treatment at a depth of 200 cm and thereafter stays constant, albeit with increasing variances with depth. In an analysis across three spatial dimensions and across time, there are many interactions of time and the spatial dimensions to be considered. Hence, we chose to use a daily model and to repeat the analysis at all time points, effectively creating an interaction model of time by the daily model. Such an approach allows great flexibility. However, this approach does not allow insight into the way in which the parameter of interest varies over time. Hence, a two-stage approach was also used, with estimates from the first stage being analysed as a set of time series. We see this spatio-temporal interaction model as being a useful approach to data measured across three spatial dimensions and time, since it does not assume additivity of the random spatial or temporal effects.
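The defining feature of the CAR layered model in the abstract above is that a site's neighbours lie only in its own depth layer. The sketch below illustrates one way such a neighbourhood structure could be built. It assumes a regular grid and first-order (rook) adjacency, neither of which is stated in the abstract, and the function name `layered_neighbours` is hypothetical.

```python
def layered_neighbours(n_rows, n_cols, n_depths):
    """Map each site (row, col, depth) to its neighbours in the SAME depth layer.

    Assumption: regular grid with rook adjacency (up, down, left, right).
    The thesis does not state its grid layout or neighbourhood definition.
    """
    neighbours = {}
    for d in range(n_depths):
        for r in range(n_rows):
            for c in range(n_cols):
                adjacent = []
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    # Keep only in-bounds sites; depth index d never changes,
                    # so no site is linked to a site in another layer.
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        adjacent.append((rr, cc, d))
                neighbours[(r, c, d)] = adjacent
    return neighbours
```

Because no edge crosses layers, each depth can carry its own structured and unstructured variance, which is the flexibility the abstract attributes to the model.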
Abstract:
Velocity jump processes are discrete random walk models that have many applications including the study of biological and ecological collective motion. In particular, velocity jump models are often used to represent a type of persistent motion, known as a “run and tumble”, which is exhibited by some isolated bacteria cells. All previous velocity jump processes are non-interacting, which means that crowding effects and agent-to-agent interactions are neglected. By neglecting these agent-to-agent interactions, traditional velocity jump models are only applicable to very dilute systems. Our work is motivated by the fact that many applications in cell biology, such as wound healing, cancer invasion and development, often involve tissues that are densely packed with cells where cell-to-cell contact and crowding effects can be important. To describe these kinds of high cell density problems using a velocity jump process we introduce three different classes of crowding interactions into a one-dimensional model. Simulation data and averaging arguments lead to a suite of continuum descriptions of the interacting velocity jump processes. We show that the resulting systems of hyperbolic partial differential equations predict the mean behavior of the stochastic simulations very well.
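As a rough illustration of an interacting velocity jump process, the sketch below simulates run-and-tumble agents on a one-dimensional lattice. The crowding rule used here (a move is aborted when the target site is occupied) is an assumption chosen for simplicity; the abstract does not specify its three classes of interaction, and all parameter names and default values are hypothetical.

```python
import random


def simulate(num_sites=50, positions=(10, 11, 12, 30), steps=200,
             turn_prob=0.1, seed=0):
    """Run-and-tumble agents on a 1D lattice with a simple exclusion rule."""
    rng = random.Random(seed)
    # Each agent carries a position x and a velocity v of -1 or +1.
    agents = [{"x": x, "v": rng.choice((-1, 1))} for x in positions]
    occupied = {a["x"] for a in agents}
    for _ in range(steps):
        # Visit agents in a random order within each time step.
        for agent in rng.sample(agents, len(agents)):
            # Tumble: reverse direction with probability turn_prob.
            if rng.random() < turn_prob:
                agent["v"] = -agent["v"]
            target = agent["x"] + agent["v"]
            # Run: move only if the target is inside the domain and empty
            # (assumed crowding rule; blocked agents stay where they are).
            if 0 <= target < num_sites and target not in occupied:
                occupied.remove(agent["x"])
                occupied.add(target)
                agent["x"] = target
    return sorted(a["x"] for a in agents)
```

Averaging the occupancy of each site over many seeds would give the kind of mean density profile that a continuum description is meant to predict.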
Abstract:
We propose to use Tensor Space Modeling (TSM) to represent and analyze users’ web log data, which consists of multiple interests and spans multiple dimensions. Further, we propose to use the decomposition factors of the tensors for clustering the users based on similarity of search behaviour. Preliminary results show that the proposed method outperforms traditional Vector Space Model (VSM) based clustering.
Abstract:
Previous research has put forward a number of properties of business process models that have an impact on their understandability. Two such properties are compactness and (block-)structuredness. What has not been sufficiently appreciated at this point is that these desirable properties may be at odds with one another. This paper presents the results of a two-pronged study aimed at exploring the trade-off between compactness and structuredness of process models. The first prong of the study is a comparative analysis of the complexity of a set of unstructured process models from industrial practice and of their corresponding structured versions. The second prong is an experiment wherein a cohort of students was exposed to semantically equivalent unstructured and structured process models. The key finding is that structuredness is not an absolute desideratum for process model understandability. Instead, subtle trade-offs between structuredness and other model properties are at play.
Abstract:
Evaluating the safety of different traffic facilities is a complex and crucial task. Microscopic simulation models have been widely used for traffic management but have been largely neglected in traffic safety studies. Using micro-simulation to study safety is more ethical and accessible than traditional safety studies, which only assess historical crash data. However, current microscopic models are unable to mimic unsafe driver behavior, as they are based on presumptions of safe driver behavior. This highlights the need for a critical examination of the current microscopic models to determine which components and parameters have an effect on safety indicator reproduction. The question then arises whether these safety indicators are valid indicators of traffic safety. The safety indicators were therefore selected and tested for straight motorway segments in Brisbane, Australia. This test examined the capability of a micro-simulation model and provides a better understanding of micro-simulation models and of how such models, in particular car-following models, can be enriched to present more accurate safety indicators.
Abstract:
Non-invasive vibration analysis has been used extensively to monitor the progression of dental implant healing and stabilization. It is now being considered as a method to monitor femoral implants in transfemoral amputees. This paper evaluates two modal analysis excitation methods and investigates their capabilities in detecting changes at the interface between the implant and the bone that occur during osseointegration. Excitation of bone-implant physical models with the electromagnetic shaker provided higher coherence values and a greater number of modes over the same frequency range when compared to the impact hammer. Differences were detected in the natural frequencies and fundamental mode shape of the model when the fit of the implant was altered in the bone. The ability to detect changes in the model dynamic properties demonstrates the potential of modal analysis in this application and warrants further investigation.
Abstract:
With the increasing number of XML documents in varied domains, it has become essential to identify ways of finding interesting information from these documents. Data mining techniques were used to derive this interesting information. Mining on XML documents is impacted by its model due to the semi-structured nature of these documents. Hence, in this chapter we present an overview of the various models of XML documents, how these models were used for mining and some of the issues and challenges in these models. In addition, this chapter also provides some insights into the future models of XML documents for effectively capturing the two important features namely structure and content of XML documents for mining.
Abstract:
Existing recommendation systems often recommend products to users by capturing item-to-item and user-to-user similarity measures. These types of recommendation systems become inefficient in people-to-people networks, where recommending people to people requires a two-way relationship. Also, existing recommendation methods use traditional two-dimensional models to find inter-relationships between alike users and items. Two-dimensional models are not efficient enough to model the people-to-people network, as the latent correlations between the people and their attributes are not utilized. In this paper, we propose a novel tensor decomposition-based recommendation method for recommending people to people based on users' profiles and their interactions. People-to-people network data is multi-dimensional; when modeled using vector-based methods it tends to suffer information loss, as these methods capture either the interactions or the attributes of the users but not both. This paper utilizes tensor models, which can correlate and find latent relationships between similar users based on both user interactions and user attributes, in order to generate recommendations. Empirical analysis is conducted on a real-life online dating dataset. As the results demonstrate, tensor modeling and decomposition enabled the identification of latent correlations between people based on their attributes and interactions in the network, and quality recommendations were derived using the 'alike' users concept.
Abstract:
Continuum, partial differential equation models are often used to describe the collective motion of cell populations, with various types of motility represented by the choice of diffusion coefficient, and cell proliferation captured by the source terms. Previously, the choice of diffusion coefficient has been largely arbitrary, with the decision to choose a particular linear or nonlinear form generally based on calibration arguments rather than making any physical connection with the underlying individual-level properties of the cell motility mechanism. In this work we provide a new link between individual-level models, which account for important cell properties such as varying cell shape and volume exclusion, and population-level partial differential equation models. We work in an exclusion process framework, considering aligned, elongated cells that may occupy more than one lattice site, in order to represent populations of agents with different sizes. Three different idealizations of the individual-level mechanism are proposed, and these are connected to three different partial differential equations, each with a different diffusion coefficient: one linear, one nonlinear and degenerate, and one nonlinear and nondegenerate. We test the ability of these three models to predict the population-level response of a cell spreading problem for both proliferative and nonproliferative cases. We also explore the potential of our models to predict long-time travelling wave invasion rates and extend our results to two-dimensional spreading and invasion. Our results show that each model can accurately predict density data for nonproliferative systems, but that only one does so for proliferative systems. Hence great care must be taken when predicting density data for populations with varying cell shape.
Abstract:
The quality of conceptual business process models is highly relevant for the design of corresponding information systems. In particular, a precise measurement of model characteristics can be beneficial from a business perspective, helping to save costs thanks to early error detection. This is just as true from a software engineering point of view. In this latter case, models facilitate stakeholder communication and software system design. Research has investigated several proposals as regards measures for business process models, from a rather correlational perspective. This is helpful for understanding, for example size and complexity as general driving forces of error probability. Yet, design decisions usually have to build on thresholds, which can reliably indicate that a certain counter-action has to be taken. This cannot be achieved only by providing measures; it requires a systematic identification of effective and meaningful thresholds. In this paper, we derive thresholds for a set of structural measures for predicting errors in conceptual process models. To this end, we use a collection of 2,000 business process models from practice as a means of determining thresholds, applying an adaptation of the ROC curves method. Furthermore, an extensive validation of the derived thresholds was conducted by using 429 EPC models from an Australian financial institution. Finally, significant thresholds were adapted to refine existing modeling guidelines in a quantitative way.
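To make the idea of deriving a threshold from a ROC curve concrete, the sketch below picks the cut-off on a structural measure that maximises Youden's J (true positive rate minus false positive rate). This is a standard selection rule used here as an assumption; the abstract mentions an adaptation of the ROC curves method without giving details, so this may differ from the authors' procedure. The function name and the sample numbers are hypothetical.

```python
def best_threshold(values, has_error):
    """Return (threshold, J) maximising Youden's J for the rule value >= threshold.

    values:    one structural measure per model (e.g. a size metric)
    has_error: parallel list of booleans, True if the model contains an error
    """
    positives = sum(1 for e in has_error if e)
    negatives = len(has_error) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one erroneous and one correct model")
    best = None
    # Every observed value is a candidate cut-off on the ROC curve.
    for t in sorted(set(values)):
        tp = sum(1 for v, e in zip(values, has_error) if v >= t and e)
        fp = sum(1 for v, e in zip(values, has_error) if v >= t and not e)
        j = tp / positives - fp / negatives
        if best is None or j > best[1]:
            best = (t, j)
    return best


# Illustrative, made-up data: model sizes and whether an error was found.
sizes = [5, 8, 12, 20, 25, 31, 40, 55]
errors = [False, False, False, False, True, False, True, True]
threshold, youden_j = best_threshold(sizes, errors)
```

A guideline could then flag models whose measure reaches the returned threshold, subject to validation on an independent collection as the abstract describes.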
Abstract:
Many modern business environments employ software to automate the delivery of workflows, whereas workflow design and generation remains a laborious technical task for domain specialists. Several different approaches have been proposed for deriving workflow models. Some approaches rely on process data mining, whereas others have proposed derivations of workflow models from operational structures, domain-specific knowledge or workflow model compositions from knowledge bases. Many approaches draw on principles from automatic planning, but are conceptual in context and lack mathematical justification. In this paper we present a mathematical framework for deducing tasks in workflow models from plans in mechanistic or strongly controlled work environments, with a focus on automatic plan generation. In addition, we prove an associative composition operator that permits crisp hierarchical task compositions for workflow models through a set of mathematical deduction rules. The result is a logical framework that can be used to prove tasks in workflow hierarchies from operational information about work processes and machine configurations in controlled or mechanistic work environments.
Abstract:
Nowadays, business process management is an important approach for managing organizations from an operational perspective. As a consequence, it is common to see organizations develop collections of hundreds or even thousands of business process models. Such large collections of process models bring new challenges and provide new opportunities, as the knowledge that they encapsulate needs to be properly managed. Therefore, a variety of techniques for managing large collections of business process models is being developed. The goal of this paper is to provide an overview of the management techniques that currently exist, as well as the open research challenges that they pose.
Abstract:
With the growing number of XML documents on the Web it becomes essential to effectively organise these XML documents in order to retrieve useful information from them. A possible solution is to apply clustering on the XML documents to discover knowledge that promotes effective data management, information retrieval and query processing. However, many issues arise in discovering knowledge from these types of semi-structured documents due to their heterogeneity and structural irregularity. Most of the existing research on clustering techniques focuses only on one feature of the XML documents, this being either their structure or their content, due to scalability and complexity problems. The knowledge gained in the form of clusters based on the structure or the content is not suitable for real-life datasets. It therefore becomes essential to include both the structure and content of XML documents in order to improve the accuracy and meaning of the clustering solution. However, the inclusion of both these kinds of information in the clustering process results in a huge overhead for the underlying clustering algorithm because of the high dimensionality of the data. The overall objective of this thesis is to address these issues by: (1) proposing methods to utilise frequent pattern mining techniques to reduce the dimension; (2) developing models to effectively combine the structure and content of XML documents; and (3) utilising the proposed models in clustering. This research first determines the structural similarity in the form of frequent subtrees and then uses these frequent subtrees to represent the constrained content of the XML documents in order to determine the content similarity. A clustering framework with two types of models, implicit and explicit, is developed. The implicit model uses a Vector Space Model (VSM) to combine the structure and the content information.
The explicit model uses a higher order model, namely a 3-order Tensor Space Model (TSM), to explicitly combine the structure and the content information. This thesis also proposes a novel incremental technique to decompose large-sized tensor models to utilise the decomposed solution for clustering the XML documents. The proposed framework and its components were extensively evaluated on several real-life datasets exhibiting extreme characteristics to understand the usefulness of the proposed framework in real-life situations. Additionally, this research evaluates the outcome of the clustering process on the collection selection problem in the information retrieval on the Wikipedia dataset. The experimental results demonstrate that the proposed frequent pattern mining and clustering methods outperform the related state-of-the-art approaches. In particular, the proposed framework of utilising frequent structures for constraining the content shows an improvement in accuracy over content-only and structure-only clustering results. The scalability evaluation experiments conducted on large-scale datasets clearly show the strengths of the proposed methods over state-of-the-art methods. In particular, this thesis work contributes to effectively combining the structure and the content of XML documents for clustering, in order to improve the accuracy of the clustering solution. In addition, it also contributes by addressing the research gaps in frequent pattern mining to generate efficient and concise frequent subtrees with various node relationships that could be used in clustering.
Abstract:
There is an intimate interconnectivity between policy guidelines defining reform and the delineation of what research methods would be subsequently applied to determine reform success. Research is guided as much by the metaphors describing it as by the ensuing empirical definition of actions of results obtained from it. In a call for different reform policy metaphors, Lumby and English (2010) note, “The primary responsibility for the parlous state of education... lies with the policy makers that have racked our schools with reductive and dehumanizing processes, following the metaphors of market efficiency, and leadership models based on accounting and the characteristics of machine bureaucracy” (p. 127).
Abstract:
This paper presents an approach to building an observation likelihood function from a set of sparse, noisy training observations taken from known locations by a sensor with no obvious geometric model. The basic approach is to fit an interpolant to the training data, representing the expected observation, and to assume additive sensor noise. This paper takes a Bayesian view of the problem, maintaining a posterior over interpolants rather than simply the maximum-likelihood interpolant, giving a measure of uncertainty in the map at any point. This is done using a Gaussian process framework. To validate the approach experimentally, a model of an environment is built using observations from an omni-directional camera. After a model has been built from the training data, a particle filter is used to localise while traversing this environment.
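The sketch below shows how a Gaussian process posterior can yield an observation likelihood of the kind described above. It is a deliberately reduced setting: one-dimensional locations, scalar observations, a squared-exponential kernel and fixed hyperparameters are all assumptions, since the abstract does not specify them, and the function names are hypothetical.

```python
import math


def rbf(a, b, length=1.0, signal_var=1.0):
    """Squared-exponential kernel (assumed; the paper's kernel is not stated)."""
    return signal_var * math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gp_predict(train_x, train_y, query, noise_var=0.01):
    """Posterior mean and variance of the interpolant at a query location."""
    gram = [[rbf(a, b) + (noise_var if i == j else 0.0)
             for j, b in enumerate(train_x)]
            for i, a in enumerate(train_x)]
    k_star = [rbf(query, a) for a in train_x]
    mean = sum(k * w for k, w in zip(k_star, solve(gram, train_y)))
    var = rbf(query, query) - sum(
        k * w for k, w in zip(k_star, solve(gram, k_star)))
    return mean, var


def log_likelihood(observation, mean, var, noise_var=0.01):
    """Gaussian log-likelihood: map uncertainty plus additive sensor noise."""
    total = var + noise_var
    return -0.5 * (math.log(2 * math.pi * total)
                   + (observation - mean) ** 2 / total)
```

In a particle filter, each particle's hypothesised location would be passed as `query`, and the exponentiated log-likelihood of the actual sensor reading would serve as that particle's weight. Far from the training data the posterior variance grows, so the likelihood flattens and such particles are neither strongly favoured nor rejected.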