888 resultados para Web modelling methods


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Web transaction data between Web visitors and Web functionalities usually convey user task-oriented behavior pattern. Mining such type of click-stream data will lead to capture usage pattern information. Nowadays Web usage mining technique has become one of most widely used methods for Web recommendation, which customizes Web content to user-preferred style. Traditional techniques of Web usage mining, such as Web user session or Web page clustering, association rule and frequent navigational path mining can only discover usage pattern explicitly. They, however, cannot reveal the underlying navigational activities and identify the latent relationships that are associated with the patterns among Web users as well as Web pages. In this work, we propose a Web recommendation framework incorporating Web usage mining technique based on Probabilistic Latent Semantic Analysis (PLSA) model. The main advantages of this method are, not only to discover usage-based access pattern, but also to reveal the underlying latent factor as well. With the discovered user access pattern, we then present user more interested content via collaborative recommendation. To validate the effectiveness of proposed approach, we conduct experiments on real world datasets and make comparisons with some existing traditional techniques. The preliminary experimental results demonstrate the usability of the proposed approach.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Numerical modelling has been used to examine the relationship between the results of two commonly used methods of assessing the propensity of coal to spontaneous combustion, the R70 and Relative Ignition Temperature tests, and the likely behaviour in situ. The criticality of various parameters has been examined and a method of utilising critical self-heating parameters has been developed. This study shows that on their own, the laboratory test results do not provide a reliable guide to in situ behaviour but can be used in combination to considerably increase the ability to predict spontaneous combustion behaviour.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Biologists are increasingly conscious of the critical role that noise plays in cellular functions such as genetic regulation, often in connection with fluctuations in small numbers of key regulatory molecules. This has inspired the development of models that capture this fundamentally discrete and stochastic nature of cellular biology - most notably the Gillespie stochastic simulation algorithm (SSA). The SSA simulates a temporally homogeneous, discrete-state, continuous-time Markov process, and of course the corresponding probabilities and numbers of each molecular species must all remain positive. While accurately serving this purpose, the SSA can be computationally inefficient due to very small time stepping so faster approximations such as the Poisson and Binomial τ-leap methods have been suggested. This work places these leap methods in the context of numerical methods for the solution of stochastic differential equations (SDEs) driven by Poisson noise. This allows analogues of Euler-Maruyuma, Milstein and even higher order methods to be developed through the Itô-Taylor expansions as well as similar derivative-free Runge-Kutta approaches. Numerical results demonstrate that these novel methods compare favourably with existing techniques for simulating biochemical reactions by more accurately capturing crucial properties such as the mean and variance than existing methods.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Time-course experiments with microarrays are often used to study dynamic biological systems and genetic regulatory networks (GRNs) that model how genes influence each other in cell-level development of organisms. The inference for GRNs provides important insights into the fundamental biological processes such as growth and is useful in disease diagnosis and genomic drug design. Due to the experimental design, multilevel data hierarchies are often present in time-course gene expression data. Most existing methods, however, ignore the dependency of the expression measurements over time and the correlation among gene expression profiles. Such independence assumptions violate regulatory interactions and can result in overlooking certain important subject effects and lead to spurious inference for regulatory networks or mechanisms. In this paper, a multilevel mixed-effects model is adopted to incorporate data hierarchies in the analysis of time-course data, where temporal and subject effects are both assumed to be random. The method starts with the clustering of genes by fitting the mixture model within the multilevel random-effects model framework using the expectation-maximization (EM) algorithm. The network of regulatory interactions is then determined by searching for regulatory control elements (activators and inhibitors) shared by the clusters of co-expressed genes, based on a time-lagged correlation coefficients measurement. The method is applied to two real time-course datasets from the budding yeast (Saccharomyces cerevisiae) genome. It is shown that the proposed method provides clusters of cell-cycle regulated genes that are supported by existing gene function annotations, and hence enables inference on regulatory interactions for the genetic network.

Relevância:

30.00% 30.00%

Publicador:

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The developments of models in Earth Sciences, e.g. for earthquake prediction and for the simulation of mantel convection, are fare from being finalized. Therefore there is a need for a modelling environment that allows scientist to implement and test new models in an easy but flexible way. After been verified, the models should be easy to apply within its scope, typically by setting input parameters through a GUI or web services. It should be possible to link certain parameters to external data sources, such as databases and other simulation codes. Moreover, as typically large-scale meshes have to be used to achieve appropriate resolutions, the computational efficiency of the underlying numerical methods is important. Conceptional this leads to a software system with three major layers: the application layer, the mathematical layer, and the numerical algorithm layer. The latter is implemented as a C/C++ library to solve a basic, computational intensive linear problem, such as a linear partial differential equation. The mathematical layer allows the model developer to define his model and to implement high level solution algorithms (e.g. Newton-Raphson scheme, Crank-Nicholson scheme) or choose these algorithms form an algorithm library. The kernels of the model are generic, typically linear, solvers provided through the numerical algorithm layer. Finally, to provide an easy-to-use application environment, a web interface is (semi-automatically) built to edit the XML input file for the modelling code. In the talk, we will discuss the advantages and disadvantages of this concept in more details. We will also present the modelling environment escript which is a prototype implementation toward such a software system in Python (see www.python.org). Key components of escript are the Data class and the PDE class. Objects of the Data class allow generating, holding, accessing, and manipulating data, in such a way that the actual, in the particular context best, representation is transparent to the user. They are also the key to establish connections with external data sources. PDE class objects are describing (linear) partial differential equation objects to be solved by a numerical library. The current implementation of escript has been linked to the finite element code Finley to solve general linear partial differential equations. We will give a few simple examples which will illustrate the usage escript. Moreover, we show the usage of escript together with Finley for the modelling of interacting fault systems and for the simulation of mantel convection.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A formalism for modelling the dynamics of Genetic Algorithms (GAs) using methods from statistical mechanics, originally due to Prugel-Bennett and Shapiro, is reviewed, generalized and improved upon. This formalism can be used to predict the averaged trajectory of macroscopic statistics describing the GA's population. These macroscopics are chosen to average well between runs, so that fluctuations from mean behaviour can often be neglected. Where necessary, non-trivial terms are determined by assuming maximum entropy with constraints on known macroscopics. Problems of realistic size are described in compact form and finite population effects are included, often proving to be of fundamental importance. The macroscopics used here are cumulants of an appropriate quantity within the population and the mean correlation (Hamming distance) within the population. Including the correlation as an explicit macroscopic provides a significant improvement over the original formulation. The formalism is applied to a number of simple optimization problems in order to determine its predictive power and to gain insight into GA dynamics. Problems which are most amenable to analysis come from the class where alleles within the genotype contribute additively to the phenotype. This class can be treated with some generality, including problems with inhomogeneous contributions from each site, non-linear or noisy fitness measures, simple diploid representations and temporally varying fitness. The results can also be applied to a simple learning problem, generalization in a binary perceptron, and a limit is identified for which the optimal training batch size can be determined for this problem. The theory is compared to averaged results from a real GA in each case, showing excellent agreement if the maximum entropy principle holds. Some situations where this approximation brakes down are identified. In order to fully test the formalism, an attempt is made on the strong sc np-hard problem of storing random patterns in a binary perceptron. Here, the relationship between the genotype and phenotype (training error) is strongly non-linear. Mutation is modelled under the assumption that perceptron configurations are typical of perceptrons with a given training error. Unfortunately, this assumption does not provide a good approximation in general. It is conjectured that perceptron configurations would have to be constrained by other statistics in order to accurately model mutation for this problem. Issues arising from this study are discussed in conclusion and some possible areas of further research are outlined.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A conventional neural network approach to regression problems approximates the conditional mean of the output vector. For mappings which are multi-valued this approach breaks down, since the average of two solutions is not necessarily a valid solution. In this article mixture density networks, a principled method to model conditional probability density functions, are applied to retrieving Cartesian wind vector components from satellite scatterometer data. A hybrid mixture density network is implemented to incorporate prior knowledge of the predominantly bimodal function branches. An advantage of a fully probabilistic model is that more sophisticated and principled methods can be used to resolve ambiguities.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A Bayesian procedure for the retrieval of wind vectors over the ocean using satellite borne scatterometers requires realistic prior near-surface wind field models over the oceans. We have implemented carefully chosen vector Gaussian Process models; however in some cases these models are too smooth to reproduce real atmospheric features, such as fronts. At the scale of the scatterometer observations, fronts appear as discontinuities in wind direction. Due to the nature of the retrieval problem a simple discontinuity model is not feasible, and hence we have developed a constrained discontinuity vector Gaussian Process model which ensures realistic fronts. We describe the generative model and show how to compute the data likelihood given the model. We show the results of inference using the model with Markov Chain Monte Carlo methods on both synthetic and real data.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The deficiencies of stationary models applied to financial time series are well documented. A special form of non-stationarity, where the underlying generator switches between (approximately) stationary regimes, seems particularly appropriate for financial markets. We use a dynamic switching (modelled by a hidden Markov model) combined with a linear dynamical system in a hybrid switching state space model (SSSM) and discuss the practical details of training such models with a variational EM algorithm due to [Ghahramani and Hilton,1998]. The performance of the SSSM is evaluated on several financial data sets and it is shown to improve on a number of existing benchmark methods.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A conventional neural network approach to regression problems approximates the conditional mean of the output vector. For mappings which are multi-valued this approach breaks down, since the average of two solutions is not necessarily a valid solution. In this article mixture density networks, a principled method to model conditional probability density functions, are applied to retrieving Cartesian wind vector components from satellite scatterometer data. A hybrid mixture density network is implemented to incorporate prior knowledge of the predominantly bimodal function branches. An advantage of a fully probabilistic model is that more sophisticated and principled methods can be used to resolve ambiguities.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Spectral and coherence methodologies are ubiquitous for the analysis of multiple time series. Partial coherence analysis may be used to try to determine graphical models for brain functional connectivity. The outcome of such an analysis may be considerably influenced by factors such as the degree of spectral smoothing, line and interference removal, matrix inversion stabilization and the suppression of effects caused by side-lobe leakage, the combination of results from different epochs and people, and multiple hypothesis testing. This paper examines each of these steps in turn and provides a possible path which produces relatively ‘clean’ connectivity plots. In particular we show how spectral matrix diagonal up-weighting can simultaneously stabilize spectral matrix inversion and reduce effects caused by side-lobe leakage, and use the stepdown multiple hypothesis test procedure to help formulate an interaction strength.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Machine breakdowns are one of the main sources of disruption and throughput fluctuation in highly automated production facilities. One element in reducing this disruption is ensuring that the maintenance team responds correctly to machine failures. It is, however, difficult to determine the current practice employed by the maintenance team, let alone suggest improvements to it. 'Knowledge based improvement' is a methodology that aims to address this issue, by (a) eliciting knowledge on current practice, (b) evaluating that practice and (c) looking for improvements. The methodology, based on visual interactive simulation and artificial intelligence methods, and its application to a Ford engine assembly facility are described. Copyright © 2002 Society of Automotive Engineers, Inc.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Simulation modelling has been used for many years in the manufacturing sector but has now become a mainstream tool in business situations. This is partly because of the popularity of business process re-engineering (BPR) and other process based improvement methods that use simulation to help analyse changes in process design. This textbook includes case studies in both manufacturing and service situations to demonstrate the usefulness of the approach. A further reason for the increasing popularity of the technique is the development of business orientated and user-friendly Windows-based software. This text provides a guide to the use of ARENA, SIMUL8 and WITNESS simulation software systems that are widely used in industry and available to students. Overall this text provides a practical guide to building and implementing the results from a simulation model. All the steps in a typical simulation study are covered including data collection, input data modelling and experimentation.