11 resultados para random coefficient regression model

em Aston University Research Archive


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Abstract A new LIBS quantitative analysis method based on analytical line adaptive selection and Relevance Vector Machine (RVM) regression model is proposed. First, a scheme of adaptively selecting analytical line is put forward in order to overcome the drawback of high dependency on a priori knowledge. The candidate analytical lines are automatically selected based on the built-in characteristics of spectral lines, such as spectral intensity, wavelength and width at half height. The analytical lines which will be used as input variables of regression model are determined adaptively according to the samples for both training and testing. Second, an LIBS quantitative analysis method based on RVM is presented. The intensities of analytical lines and the elemental concentrations of certified standard samples are used to train the RVM regression model. The predicted elemental concentration analysis results will be given with a form of confidence interval of probabilistic distribution, which is helpful for evaluating the uncertainness contained in the measured spectra. Chromium concentration analysis experiments of 23 certified standard high-alloy steel samples have been carried out. The multiple correlation coefficient of the prediction was up to 98.85%, and the average relative error of the prediction was 4.01%. The experiment results showed that the proposed LIBS quantitative analysis method achieved better prediction accuracy and better modeling robustness compared with the methods based on partial least squares regression, artificial neural network and standard support vector machine.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We study the persistence phenomenon in a socio-econo dynamics model using computer simulations at a nite temperature on hypercubic lattices in dimensions up to ve. The model includes a \social" local eld which contains the magnetization at time t. The nearest neighbour quenched interactions are drawn from a binary distribution which is a function of the bond concentration, p. The decay of the persistence probability in the model depends on both the spatial dimension and p. We nd no evidence of \blocking" in this model. We also discuss the implications of our results for possible applications in the social and economic elds. It is suggested that the absence, or otherwise, of blocking could be used as a criterion to decide on the validity of a given model in dierent scenarios.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Data fluctuation in multiple measurements of Laser Induced Breakdown Spectroscopy (LIBS) greatly affects the accuracy of quantitative analysis. A new LIBS quantitative analysis method based on the Robust Least Squares Support Vector Machine (RLS-SVM) regression model is proposed. The usual way to enhance the analysis accuracy is to improve the quality and consistency of the emission signal, such as by averaging the spectral signals or spectrum standardization over a number of laser shots. The proposed method focuses more on how to enhance the robustness of the quantitative analysis regression model. The proposed RLS-SVM regression model originates from the Weighted Least Squares Support Vector Machine (WLS-SVM) but has an improved segmented weighting function and residual error calculation according to the statistical distribution of measured spectral data. Through the improved segmented weighting function, the information on the spectral data in the normal distribution will be retained in the regression model while the information on the outliers will be restrained or removed. Copper elemental concentration analysis experiments of 16 certified standard brass samples were carried out. The average value of relative standard deviation obtained from the RLS-SVM model was 3.06% and the root mean square error was 1.537%. The experimental results showed that the proposed method achieved better prediction accuracy and better modeling robustness compared with the quantitative analysis methods based on Partial Least Squares (PLS) regression, standard Support Vector Machine (SVM) and WLS-SVM. It was also demonstrated that the improved weighting function had better comprehensive performance in model robustness and convergence speed, compared with the four known weighting functions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We examine the empirical evidence for an environmental Kuznets curve using a semiparametric smooth coefficient regression model that allows us to incorporate flexibility in the parameter estimates, while maintaining the basic econometric structure that is typically used to estimate the pollution-income relationship. This allows us to assess the sensitivity to parameter heterogeneity of typical parametric models used to estimate the relationship between pollution and income, as well as identify why the results from such models are seldom found to be robust. Our results confirm that the resulting relationship between pollution and income is fragile; we show that the estimated pollution-income relationship depends substantially on the heterogeneity of the slope coefficients and the parameter values at which the relationship is evaluated. Different sets of parameters obtained from the semiparametric model give rise to many different shapes for the pollution-income relationship that are commonly found in the literature.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper analyzes the impact of load factor, facility and generator types on the productivity of Korean electric power plants. In order to capture important differences in the effect of load policy on power output, we use a semiparametric smooth coefficient (SPSC) model that allows us to model heterogeneous performances across power plants and over time by allowing underlying technologies to be heterogeneous. The SPSC model accommodates both continuous and discrete covariates. Various specification tests are conducted to compare performance of the SPSC model. Using a unique generator level panel dataset spanning the period 1995-2006, we find that the impact of load factor, generator and facility types on power generation varies substantially in terms of magnitude and significance across different plant characteristics. The results have strong implication for generation policy in Korea as outlined in this study.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper analyzes the impact of load factor, facility and generator types on the productivity of Korean electric power plants. In order to capture important differences in the effect of load policy on power output, we use a semiparametric smooth coefficient (SPSC) model that allows us to model heterogeneous performances across power plants and over time by allowing underlying technologies to be heterogeneous. The SPSC model accommodates both continuous and discrete covariates. Various specification tests are conducted to assess the performance of the SPSC model. Using a unique generator level panel dataset spanning the period 1995-2006, we find that the impact of load factor, generator and facility types on power generation varies substantially in terms of magnitude and significance across different plant characteristics. The results have strong implications for generation policy in Korea as outlined in this study.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Solving many scientific problems requires effective regression and/or classification models for large high-dimensional datasets. Experts from these problem domains (e.g. biologists, chemists, financial analysts) have insights into the domain which can be helpful in developing powerful models but they need a modelling framework that helps them to use these insights. Data visualisation is an effective technique for presenting data and requiring feedback from the experts. A single global regression model can rarely capture the full behavioural variability of a huge multi-dimensional dataset. Instead, local regression models, each focused on a separate area of input space, often work better since the behaviour of different areas may vary. Classical local models such as Mixture of Experts segment the input space automatically, which is not always effective and it also lacks involvement of the domain experts to guide a meaningful segmentation of the input space. In this paper we addresses this issue by allowing domain experts to interactively segment the input space using data visualisation. The segmentation output obtained is then further used to develop effective local regression models.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

National guidance and clinical guidelines recommended multidisciplinary teams (MDTs) for cancer services in order to bring specialists in relevant disciplines together, ensure clinical decisions are fully informed, and to coordinate care effectively. However, the effectiveness of cancer teams was not previously evaluated systematically. A random sample of 72 breast cancer teams in England was studied (548 members in six core disciplines), stratified by region and caseload. Information about team constitution, processes, effectiveness, clinical performance, and members' mental well-being was gathered using appropriate instruments. Two input variables, team workload (P=0.009) and the proportion of breast care nurses (P=0.003), positively predicted overall clinical performance in multivariate analysis using a two-stage regression model. There were significant correlations between individual team inputs, team composition variables, and clinical performance. Some disciplines consistently perceived their team's effectiveness differently from the mean. Teams with shared leadership of their clinical decision-making were most effective. The mental well-being of team members appeared significantly better than in previous studies of cancer clinicians, the NHS, and the general population. This study established that team composition, working methods, and workloads are related to measures of effectiveness, including the quality of clinical care. © 2003 Cancer Research UK.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we present a novel method for emulating a stochastic, or random output, computer model and show its application to a complex rabies model. The method is evaluated both in terms of accuracy and computational efficiency on synthetic data and the rabies model. We address the issue of experimental design and provide empirical evidence on the effectiveness of utilizing replicate model evaluations compared to a space-filling design. We employ the Mahalanobis error measure to validate the heteroscedastic Gaussian process based emulator predictions for both the mean and (co)variance. The emulator allows efficient screening to identify important model inputs and better understanding of the complex behaviour of the rabies model.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Solving many scientific problems requires effective regression and/or classification models for large high-dimensional datasets. Experts from these problem domains (e.g. biologists, chemists, financial analysts) have insights into the domain which can be helpful in developing powerful models but they need a modelling framework that helps them to use these insights. Data visualisation is an effective technique for presenting data and requiring feedback from the experts. A single global regression model can rarely capture the full behavioural variability of a huge multi-dimensional dataset. Instead, local regression models, each focused on a separate area of input space, often work better since the behaviour of different areas may vary. Classical local models such as Mixture of Experts segment the input space automatically, which is not always effective and it also lacks involvement of the domain experts to guide a meaningful segmentation of the input space. In this paper we addresses this issue by allowing domain experts to interactively segment the input space using data visualisation. The segmentation output obtained is then further used to develop effective local regression models.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Dirichlet process mixture model (DPMM) is a ubiquitous, flexible Bayesian nonparametric statistical model. However, full probabilistic inference in this model is analytically intractable, so that computationally intensive techniques such as Gibbs sampling are required. As a result, DPMM-based methods, which have considerable potential, are restricted to applications in which computational resources and time for inference is plentiful. For example, they would not be practical for digital signal processing on embedded hardware, where computational resources are at a serious premium. Here, we develop a simplified yet statistically rigorous approximate maximum a-posteriori (MAP) inference algorithm for DPMMs. This algorithm is as simple as DP-means clustering, solves the MAP problem as well as Gibbs sampling, while requiring only a fraction of the computational effort. (For freely available code that implements the MAP-DP algorithm for Gaussian mixtures see http://www.maxlittle.net/.) Unlike related small variance asymptotics (SVA), our method is non-degenerate and so inherits the “rich get richer” property of the Dirichlet process. It also retains a non-degenerate closed-form likelihood which enables out-of-sample calculations and the use of standard tools such as cross-validation. We illustrate the benefits of our algorithm on a range of examples and contrast it to variational, SVA and sampling approaches from both a computational complexity perspective as well as in terms of clustering performance. We demonstrate the wide applicabiity of our approach by presenting an approximate MAP inference method for the infinite hidden Markov model whose performance contrasts favorably with a recently proposed hybrid SVA approach. Similarly, we show how our algorithm can applied to a semiparametric mixed-effects regression model where the random effects distribution is modelled using an infinite mixture model, as used in longitudinal progression modelling in population health science. Finally, we propose directions for future research on approximate MAP inference in Bayesian nonparametrics.