Digital Library

916 results for Multilevel Linear Models

Estimating conditional volatility with neural networks

Relevance:

80.00% 80.00%

Publisher:

Abstract:

It is well known that one of the obstacles to effective forecasting of exchange rates is heteroscedasticity (non-stationary conditional variance). The autoregressive conditional heteroscedastic (ARCH) model and its variants have been used to estimate a time dependent variance for many financial time series. However, such models are essentially linear in form and we can ask whether a non-linear model for variance can improve results just as non-linear models (such as neural networks) for the mean have done. In this paper we consider two neural network models for variance estimation. Mixture Density Networks (Bishop 1994, Nix and Weigend 1994) combine a Multi-Layer Perceptron (MLP) and a mixture model to estimate the conditional data density. They are trained using a maximum likelihood approach. However, it is known that maximum likelihood estimates are biased and lead to a systematic under-estimate of variance. More recently, a Bayesian approach to parameter estimation has been developed (Bishop and Qazaz 1996) that shows promise in removing the maximum likelihood bias. However, up to now, this model has not been used for time series prediction. Here we compare these algorithms with two other models to provide benchmark results: a linear model (from the ARIMA family), and a conventional neural network trained with a sum-of-squares error function (which estimates the conditional mean of the time series with a constant variance noise model). This comparison is carried out on daily exchange rate data for five currencies.
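The variance under-estimate mentioned in this abstract can be seen in the simplest possible setting. The sketch below is not the mixture density network from the paper: it assumes i.i.d. Gaussian data with an unknown constant mean, and all function names are hypothetical. It compares the maximum likelihood variance estimate (divide by n) with the unbiased one (divide by n - 1).

```python
import random

def ml_variance(xs):
    """Maximum likelihood variance estimate: divides by n."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n

def unbiased_variance(xs):
    """Bias-corrected variance estimate: divides by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def average_ml_variance(n, trials, seed=0):
    """Average the ML estimate over many samples drawn with true variance 1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += ml_variance(sample)
    return total / trials
```

For samples of size n, the ML estimate has expectation (n - 1)/n times the true variance, so with n = 5 the average returned by `average_ml_variance` should sit near 0.8 rather than 1.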

Developments of the generative topographic mapping

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The generative topographic mapping (GTM) model was introduced by Bishop et al. (1998, Neural Comput. 10(1), 215-234) as a probabilistic reformulation of the self-organizing map (SOM). It offers a number of advantages compared with the standard SOM, and has already been used in a variety of applications. In this paper we report on several extensions of the GTM, including an incremental version of the EM algorithm for estimating the model parameters, the use of local subspace models, extensions to mixed discrete and continuous data, semi-linear models which permit the use of high-dimensional manifolds whilst avoiding computational intractability, Bayesian inference applied to hyper-parameters, and an alternative framework for the GTM based on Gaussian processes. All of these developments directly exploit the probabilistic structure of the GTM, thereby allowing the underlying modelling assumptions to be made explicit. They also highlight the advantages of adopting a consistent probabilistic framework for the formulation of pattern recognition algorithms.

Efficient training of RBF networks for classification

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Radial Basis Function networks with linear outputs are often used in regression problems because they can be substantially faster to train than Multi-layer Perceptrons. For classification problems, the use of linear outputs is less appropriate as the outputs are not guaranteed to represent probabilities. In this paper we show how RBFs with logistic and softmax outputs can be trained efficiently using algorithms derived from Generalised Linear Models. This approach is compared with standard non-linear optimisation algorithms on a number of datasets.
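The abstract does not say which GLM-derived algorithm is used. A common choice for fitting a logistic output layer is iteratively reweighted least squares (IRLS), so the sketch below assumes that. It also assumes fixed Gaussian basis functions on a scalar input, a small ridge penalty, and step halving for stability; every name and the toy data are hypothetical.

```python
import math

def rbf_features(x, centres, width):
    """Gaussian basis activations for a scalar input, followed by a bias term."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centres] + [1.0]

def sigmoid(a):
    """Logistic function, with the activation clipped to avoid overflow."""
    a = max(-30.0, min(30.0, a))
    return 1.0 / (1.0 + math.exp(-a))

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        w[i] = (M[i][n] - sum(M[i][j] * w[j] for j in range(i + 1, n))) / M[i][i]
    return w

def loss(Phi, t, w, ridge):
    """Ridge-penalised cross-entropy error of the logistic output."""
    total = 0.5 * ridge * sum(wi * wi for wi in w)
    for row, ti in zip(Phi, t):
        y = sigmoid(sum(p * wi for p, wi in zip(row, w)))
        total -= ti * math.log(y) + (1.0 - ti) * math.log(1.0 - y)
    return total

def fit_irls(Phi, t, n_iter=25, ridge=1e-2):
    """Fit output weights by Newton/IRLS steps, halving any step that raises the loss."""
    k = len(Phi[0])
    w = [0.0] * k
    for _ in range(n_iter):
        y = [sigmoid(sum(p * wi for p, wi in zip(row, w))) for row in Phi]
        grad = [sum(row[j] * (yi - ti) for row, yi, ti in zip(Phi, y, t)) + ridge * w[j]
                for j in range(k)]
        H = [[sum(row[i] * row[j] * yi * (1.0 - yi) for row, yi in zip(Phi, y))
              + (ridge if i == j else 0.0) for j in range(k)] for i in range(k)]
        step = solve(H, grad)
        scale, current = 1.0, loss(Phi, t, w, ridge)
        while scale > 1e-4:
            trial = [wi - scale * si for wi, si in zip(w, step)]
            if loss(Phi, t, trial, ridge) <= current:
                w = trial
                break
            scale *= 0.5
    return w

def predict(x, w, centres, width):
    """Estimated probability of class 1 for a scalar input."""
    return sigmoid(sum(p * wi for p, wi in zip(rbf_features(x, centres, width), w)))

# Toy problem (hypothetical): class 1 inside |x| < 1, class 0 outside.
centres, width = [-2.0, 0.0, 2.0], 1.0
xs = [-3.0 + 0.25 * i for i in range(25)]
ts = [1.0 if abs(x) < 1.0 else 0.0 for x in xs]
w = fit_irls([rbf_features(x, centres, width) for x in xs], ts)
```

Because the basis functions are held fixed, only the output weights are fitted, and each IRLS step is a weighted linear least-squares solve. This is what makes the approach attractive compared with general non-linear optimisation.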

DVMS 1.5: A user manual (the data visualization and modeling system)

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The data available during the drug discovery process is vast in amount and diverse in nature. To gain useful information from such data, an effective visualisation tool is required. To provide better visualisation facilities to domain experts (screening scientists, biologists, chemists, etc.), we developed software based on recently developed principled visualisation algorithms such as Generative Topographic Mapping (GTM) and Hierarchical Generative Topographic Mapping (HGTM). The software also supports conventional visualisation techniques such as Principal Component Analysis, NeuroScale, PhiVis, and Locally Linear Embedding (LLE). It also provides global and local regression facilities, supporting regression algorithms such as Multilayer Perceptron (MLP), Radial Basis Function network (RBF), Generalised Linear Models (GLM), Mixture of Experts (MoE), and the newly developed Guided Mixture of Experts (GME). This user manual gives an overview of the purpose of the software tool, highlights some of the issues to take care of when creating a new model, and explains how to install and use the tool. The manual does not require readers to be familiar with the algorithms it implements; basic computing skills are enough to operate the software.

Bayesian data assimilation

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This thesis addresses data assimilation, which typically refers to the estimation of the state of a physical system given a model and observations, and its application to short-term precipitation forecasting. A general introduction to data assimilation is given, from both a deterministic and a stochastic point of view. Data assimilation algorithms are reviewed, first in the static case (when no dynamics are involved), then in the dynamic case. A double experiment on two non-linear models, the Lorenz 63 and the Lorenz 96 models, is run, and the comparative performance of the methods is discussed in terms of quality of the assimilation, robustness in the non-linear regime, and computational time. Following the general review and analysis, data assimilation is discussed in the particular context of very short-term rainfall forecasting (nowcasting) using radar images. An extended Bayesian precipitation nowcasting model is introduced. The model is stochastic in nature and relies on the spatial decomposition of the rainfall field into rain "cells". Radar observations are assimilated using a Variational Bayesian method in which the true posterior distribution of the parameters is approximated by a more tractable distribution. The motion of the cells is captured by a 2D Gaussian process. The model is tested on two precipitation events, the first dominated by convective showers, the second by precipitation fronts. Several deterministic and probabilistic validation methods are applied, and the model is shown to retain reasonable prediction skill at up to 3 hours lead time. Extensions to the model are discussed.
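For readers unfamiliar with the Lorenz 63 test model named in this abstract, the sketch below integrates its three equations with a fourth-order Runge-Kutta step. The parameter values (sigma = 10, rho = 28, beta = 8/3) are the commonly used ones and are an assumption here, since the abstract does not state the thesis settings. The function names are hypothetical, and no assimilation method is implemented.

```python
def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Time derivative of the Lorenz 63 system at the given (x, y, z) state."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(f, state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(f, state, dt, n_steps):
    """Return the trajectory, including the initial state, as a list of tuples."""
    trajectory = [state]
    for _ in range(n_steps):
        state = rk4_step(f, state, dt)
        trajectory.append(state)
    return trajectory

# Example run (hypothetical initial condition and step size).
trajectory = integrate(lorenz63, (1.0, 1.0, 1.0), 0.01, 1000)
```

A trajectory like this is the kind of "truth" run from which synthetic observations are typically drawn when comparing assimilation methods on a non-linear model.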

Practical methods of tracking of nonstationary time series applied to real-world data

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In this paper, we discuss some practical implications for implementing adaptable network algorithms applied to non-stationary time series problems. Two real world data sets, containing electricity load demands and foreign exchange market prices, are used to test several different methods, ranging from linear models with fixed parameters, to non-linear models which adapt both parameters and model order on-line. Training with the extended Kalman filter, we demonstrate that the dynamic model-order increment procedure of the resource allocating RBF network (RAN) is highly sensitive to the parameters of the novelty criterion. We investigate the use of system noise for increasing the plasticity of the Kalman filter training algorithm, and discuss the consequences for on-line model order selection. The results of our experiments show that there are advantages to be gained in tracking real world non-stationary data through the use of more complex adaptive models.

Probabilistic multiple model neural network based leak detection system: experimental study

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This paper presents an effective decision-making system for leak detection based on multiple generalized linear models and clustering techniques. The training data for the proposed decision system are obtained by setting up a fully operational experimental pipeline distribution system. The system is equipped with data logging for three variables: inlet pressure, outlet pressure, and outlet flow. The experimental setup is designed so that multiple operating conditions of the distribution system, including multiple pressures and flows, can be obtained. We then show, using statistical tests, that the pressure and flow variables can be used as a signature of a leak under the designed operating conditions. It is then shown that leak detection based on training and testing the proposed multi-model decision system with prior data clustering, under multiple operating conditions, produces better recognition rates than training based on a single-model approach. The decision system is then equipped with estimates of confidence limits, and a method is proposed for using these confidence limits to obtain more robust leak recognition results.

Parameter Identification of a Fed-Batch Cultivation of S. Cerevisiae using Genetic Algorithms

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Fermentation processes, as objects of modelling and high-quality control, are characterized by interdependent and time-varying process variables, which lead to non-linear models with a very complex structure. This is why conventional optimization methods cannot lead to a satisfactory solution. As an alternative, genetic algorithms, as a stochastic global optimization method, can be applied to overcome these limitations. The application of genetic algorithms is a precondition for robustness and for reaching a global minimum, which makes them eligible and more workable for parameter identification of fermentation models. Different types of genetic algorithms, namely simple, modified and multi-population ones, have been applied and compared for estimating the parameters of a nonlinear dynamic model of a fed-batch cultivation of S. cerevisiae.

Detecting Precipitation Climate Changes: An Approach Based on a Stochastic Daily Precipitation Model

Relevance:

80.00% 80.00%

Publisher:

Abstract:

2002 Mathematics Subject Classification: 62M10.

About the Oral Health of Bulgarian Population over 20 Years Old

Relevance:

80.00% 80.00%

Publisher:

Abstract:

2000 Mathematics Subject Classification: 62P10, 62J12.

Statistical Inference for Processes Depending on Environments and Application in Regenerative Processes

Relevance:

80.00% 80.00%

Publisher:

Abstract:

We build the Conditional Least Squares Estimator of 0 based on the observation of a single trajectory of {Zk,Ck}k, and give conditions ensuring its strong consistency. The particular case of general linear models according to 0=( 0, 0) and, among them, regenerative processes, is studied in more detail. In this framework, we may also prove the consistency of the estimator of 0 although it belongs to an asymptotically negligible part of the model, and the asymptotic law of the estimator may also be calculated.

The use of Poisson regression in the sociological study of suicide

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This paper explains how Poisson regression can be used in studies in which the dependent variable describes the number of occurrences of some rare event such as suicide. After pointing out why ordinary linear regression is inappropriate for treating dependent variables of this sort, we go on to present the basic Poisson regression model and show how it fits in the broad class of generalized linear models. Then we turn to discussing a major problem of Poisson regression known as overdispersion and suggest possible solutions, including the correction of standard errors and negative binomial regression. The paper ends with a detailed empirical example, drawn from our own research on suicide.
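The workflow described in this abstract can be sketched in a few lines: fit a log-linear Poisson model, then check for overdispersion with the Pearson statistic. The code below assumes a single covariate and fits the model by Newton-Raphson, which coincides with IRLS for this GLM. The counts are invented purely for illustration and are not the paper's suicide data; the function names are hypothetical.

```python
import math

def fit_poisson(xs, ys, n_iter=50):
    """Fit log(mu) = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * x) for x in xs]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(y - m for y, m in zip(ys, mu))
        g1 = sum((y - m) * x for x, y, m in zip(xs, ys, mu))
        # Fisher information matrix entries.
        h00 = sum(mu)
        h01 = sum(m * x for x, m in zip(xs, mu))
        h11 = sum(m * x * x for x, m in zip(xs, mu))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def pearson_dispersion(xs, ys, b0, b1):
    """Pearson chi-square divided by residual degrees of freedom (n - 2)."""
    mu = [math.exp(b0 + b1 * x) for x in xs]
    chi2 = sum((y - m) ** 2 / m for y, m in zip(ys, mu))
    return chi2 / (len(xs) - 2)

# Invented example counts (hypothetical data).
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [0, 1, 4, 2, 9, 3, 12]
b0, b1 = fit_poisson(xs, ys)
phi = pearson_dispersion(xs, ys, b0, b1)
se_inflation = math.sqrt(phi)  # factor applied to Poisson standard errors
```

A dispersion value well above 1 suggests overdispersion. One of the corrections mentioned in the abstract is to inflate the Poisson standard errors by the square root of this value; negative binomial regression is the alternative and is not sketched here.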

Factors associated with prompt difficulty in automated essay scoring

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This study explores factors related to prompt difficulty in Automated Essay Scoring. The sample was composed of 6,924 students. For each student, there were 1-4 essays, across 20 different writing prompts, for a total of 20,243 essays. The E-rater® v.2 essay scoring engine developed by the Educational Testing Service was used to score the essays. The scoring engine employs a statistical model that incorporates 10 predictors associated with writing characteristics, of which 8 were used. Rasch partial credit analysis was applied to the scores to determine the difficulty levels of the prompts. In addition, the scores were used as outcomes in a series of hierarchical linear models (HLM) in which students and prompts constituted the cross-classification levels. This methodology was used to explore the partitioning of the essay score variance. The results indicated significant differences in prompt difficulty levels due to genre. Descriptive prompts, as a group, were found to be more difficult than the persuasive prompts. In addition, the essay score variance was partitioned between students and prompts. The amount of the essay score variance that lies between prompts was found to be relatively small (4 to 7 percent). When the essay-level, student-level, and prompt-level predictors were included in the model, it was able to explain almost all of the variance that lies between prompts. Since in most high-stakes writing assessments only 1-2 prompts per student are used, the essay score variance that lies between prompts represents an undesirable or "noise" variation. Identifying factors associated with this "noise" variance may prove to be important for prompt writing and for constructing Automated Essay Scoring mechanisms that weight prompt difficulty when assigning essay scores.

Periphyton light transmission relationships in Florida Bay and the Florida Keys, USA

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Light transmission was measured through intact, submerged periphyton communities on artificial seagrass leaves. The periphyton communities were representative of the communities on Thalassia testudinum in subtropical seagrass meadows. The periphyton communities sampled were adhered carbonate sediment, coralline algae, and mixed algal assemblages. Crustose or film-forming periphyton assemblages were best prepared for light transmission measurements using artificial leaves fouled on both sides, while measurements through three-dimensional filamentous algae required the periphyton to be removed from one side. For one-sided samples, light transmission could be measured as the difference between fouled and reference artificial leaf samples. For two-sided samples, the percent periphyton light transmission to the leaf surface was calculated as the square root of the fraction of incident light. Linear, exponential, and hyperbolic equations were evaluated as descriptors of the periphyton dry weight versus light transmission relationship. Hyperbolic and exponential decay models were superior to linear models and exhibited the best fits for the observed relationships. Differences between the coefficients of determination (r²) of hyperbolic and exponential decay models were statistically insignificant. Constraining these models for 100% light transmission at zero periphyton load did not result in any statistically significant loss in the explanatory capability of the models. In almost all cases, increasing model complexity by using three-parameter models rather than two-parameter models did not significantly increase the amount of variation explained. Constrained two-parameter hyperbolic or exponential decay models were judged best for describing the periphyton dry weight versus light transmission relationship. On T. testudinum in Florida Bay and the Florida Keys, significant differences were not observed in the light transmission characteristics of the varying periphyton communities at different study sites. Using pooled data from the study sites, the hyperbolic decay coefficient for periphyton light transmission was estimated to be 4.36 mg dry wt. cm⁻². For exponential models, the exponential decay coefficient was estimated to be 0.16 cm² mg dry wt.⁻¹.
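The abstract reports the fitted coefficients but not the model equations. The sketch below therefore assumes the usual constrained forms: T = 100 · b / (b + W) for the hyperbolic model and T = 100 · exp(-k · W) for the exponential model, where W is periphyton dry weight in mg cm⁻². Both functional forms and all function names are assumptions. The units of the reported coefficients are consistent with these forms, but the paper itself should be checked.

```python
import math

HYPERBOLIC_COEFF = 4.36   # mg dry wt. cm^-2, value reported in the abstract
EXPONENTIAL_COEFF = 0.16  # cm^2 mg dry wt.^-1, value reported in the abstract

def one_side_transmission(two_sided_fraction):
    """Fraction of light reaching the leaf surface, from a two-sided measurement.

    Treats light through a leaf fouled on both sides as crossing two equal
    periphyton layers, so a single layer transmits the square root.
    """
    return math.sqrt(two_sided_fraction)

def hyperbolic_transmission(load, b=HYPERBOLIC_COEFF):
    """Percent transmission under the assumed form 100 * b / (b + load)."""
    return 100.0 * b / (b + load)

def exponential_transmission(load, k=EXPONENTIAL_COEFF):
    """Percent transmission under the assumed form 100 * exp(-k * load)."""
    return 100.0 * math.exp(-k * load)
```

Under these assumed forms, both models give 100% transmission at zero load, which matches the constraint described in the abstract. The hyperbolic coefficient is then the load at which transmission falls to 50%.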

The association of depression and perceived stress with beta cell function between African and Haitian Americans with and without type 2 diabetes

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Background: Diabetes and diabetes-related complications are major causes of morbidity and mortality in the United States. Depressive symptoms and perceived stress have been identified as possible risk factors for beta cell dysfunction and diabetes. The purpose of this study was to assess associations of depressive symptoms and perceived stress with beta cell function in African and Haitian Americans with and without type 2 diabetes. Participants and Methods: Informed consent and data were available for 462 participants (231 African Americans and 231 Haitian Americans) for this cross-sectional study. A demographic questionnaire developed by the Primary Investigator was used to collect information regarding age, gender, smoking, and ethnicity. Diabetes status was determined by self-report and confirmed by fasting blood glucose. Anthropometrics (weight, height, and waist circumference) and vital signs (blood pressure) were taken. Blood samples were drawn after 8 to 10 hours of overnight fasting to measure lipid panel, fasting plasma glucose, and serum insulin concentrations. The homeostatic model assessment, version 2 (HOMA2) computer model was used to calculate beta cell function. Depression was assessed using the Beck Depression Inventory-II (BDI-II) and stress levels were assessed using the Perceived Stress Scale (PSS). Results: Moderate to severe depressive symptoms were more likely for persons with diabetes (p = 0.030). There were no differences in perceived stress by ethnicity or diabetes status (p = 0.283). General linear models for participants with and without type 2 diabetes, using beta cell function as the dependent variable, showed no association with depressive symptoms or perceived stress; however, Haitian Americans had significantly lower beta cell function than African Americans, both with and without diabetes, after adjusting for age, gender, waist circumference, and smoking. Further research is needed to compare these risk factors in other race/ethnic groups.
