883 results for Fractional regression models
Abstract:
We consider rank-based regression models for repeated measures. To account for possible within-subject correlations, we decompose the total ranks into between- and within-subject ranks and obtain two different estimators based on between- and within-subject ranks. A simple perturbation method is then introduced to generate bootstrap replicates of the estimating functions and the parameter estimates. This provides a convenient way of combining the corresponding two types of estimating functions for more efficient estimation.
Abstract:
Partial least squares regression models on NIR spectra are often optimised (for wavelength range, mathematical pretreatment and outlier elimination) in terms of calibration terms of validation performance with reference to totally independent populations.
Abstract:
This thesis report attempts to improve the models for predicting forest stand structure for practical use, e.g. forest management planning (FMP) purposes in Finland. Comparisons were made between the Weibull and Johnson's SB distributions and alternative regression estimation methods. The data used for preliminary studies were local, but the final models were based on representative data. Models were validated mainly in terms of bias and RMSE in the main stand characteristics (e.g. volume) using independent data. The bivariate SBB distribution model was used to mimic realistic variations in tree dimensions by including within-diameter-class height variation. Using the traditional method, diameter distribution with the expected height resulted in reduced height variation, whereas the alternative bivariate method utilized the error term of the height model. The lack of models for FMP was covered to some extent by the models for peatland and juvenile stands. The validation of these models showed that the more sophisticated regression estimation methods provided slightly improved accuracy. A flexible prediction and application for stand structure consisted of seemingly unrelated regression models for eight stand characteristics, the parameters of three optional distributions and Näslund's height curve. The cross-model covariance structure was used for linear prediction application, in which the expected values of the models were calibrated with the known stand characteristics. This provided a framework to validate the optional distributions and the optional set of stand characteristics. Height distribution is recommended for the earliest state of stands because of its continuous feature. From a mean height of about 4 m, the Weibull dbh-frequency distribution is recommended in young stands if the input variables consist of arithmetic stand characteristics. In advanced stands, basal area-dbh distribution models are recommended. Näslund's height curve proved useful.
Some efficient transformations of stand characteristics are introduced, e.g. the shape index, which combines the basal area, the stem number and the median diameter. The shape index enabled the SB model for peatland stands to detect large variation in stand densities. This model also demonstrated reasonable behaviour for stands in mineral soils.
Abstract:
The soil moisture characteristic (SMC) forms an important input to mathematical models of water and solute transport in the unsaturated-soil zone. Owing to their simplicity and ease of use, texture-based regression models are commonly used to estimate the SMC from basic soil properties. In this study, the performances of six such regression models were evaluated on three soils. Moisture characteristics generated by the regression models were statistically compared with the characteristics developed independently from laboratory and in-situ retention data of the soil profiles. Results of the statistical performance evaluation, while providing useful information on the errors involved in estimating the SMC, also highlighted the importance of the nature of the data set underlying the regression models. Among the models evaluated, the one possessing an underlying data set of in-situ measurements was found to be the best estimator of the in-situ SMC for all the soils. Considerable errors arose when a textural model based on laboratory data was used to estimate the field retention characteristics of unsaturated soils.
Abstract:
158 p.
Abstract:
We develop a convex relaxation of maximum a posteriori estimation of a mixture of regression models. Although our relaxation involves a semidefinite matrix variable, we reformulate the problem to eliminate the need for general semidefinite programming. In particular, we provide two reformulations that admit fast algorithms. The first is a max-min spectral reformulation exploiting quasi-Newton descent. The second is a min-min reformulation consisting of fast alternating steps of closed-form updates. We evaluate the methods against Expectation-Maximization in a real problem of motion segmentation from video data.
Abstract:
The farming of Chinese mitten crab, a quality aquatic product in China and neighbouring Asian countries, has developed rapidly in China over the last decade. It reached a total yield of 3.4 × 10^5 tonnes in 2002. Due to successive over-stocking year after year, many lakes in the mid-lower Yangtze Basin, the main farming area, are deteriorating, leading to a reduction of crab yield and quality and, subsequently, a loss of farming profits. Aiming at a normal development of crab culture and the sustainable use of lakes, an annual investigation dealing with lake environmental factors in relation to stocked crab populations was carried out at 20 farms in 4 lakes. The results show that the submersed macrophyte biomass (B_Mac) is the key factor affecting annual crab yield (CY). Using the ratio of Secchi depth to mean depth (Z_SD/Z_M), an easily measured parameter closely correlated to B_Mac, as the driving variable, 10 regression models of maximal crab yields were generated (r^2 ranging from 0.49 to 0.81). Based on the theory of MSY (Maximum Sustainable Yield), in combination with body weight (BW) and recapture rate (RR) of adult crabs, a general optimal stocking model was eventually formulated. All models are simple and easy to operate. Comments on their applications and prospects are given in brief. © 2006 Elsevier B.V. All rights reserved.
Abstract:
In this paper, we introduce the method of leaps and bounds regression, which can be used to select variables quickly and obtain the best regression models. These models contain one variable, two variables, three variables and so on. The results obtained with leaps and bounds regression were compared with those obtained with stepwise regression, leading to the conclusion that leaps and bounds regression is an effective method.
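To make "the best regression models with one variable, two variables, and so on" concrete, the following is a minimal sketch that finds the lowest-RSS subset of each size by plain exhaustive enumeration. It is not the leaps and bounds algorithm itself, which aims at the same answer while pruning subsets that cannot win; the data, the helper names (`solve`, `rss_for`, `best_subsets`) and the use of RSS as the criterion are assumptions made for illustration only.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss_for(cols, X, y):
    """Residual sum of squares of OLS of y on an intercept plus the chosen columns."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(cols) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2 for r, t in zip(rows, y))


def best_subsets(X, y):
    """For each subset size k, return the column indices with the smallest RSS."""
    p = len(X[0])
    return {
        k: min(combinations(range(p), k), key=lambda cols: rss_for(cols, X, y))
        for k in range(1, p + 1)
    }


# Synthetic data (hypothetical): y depends on columns 0 and 2 only.
X = [[1, 2, 0], [2, 1, 1], [3, 5, 0], [4, 3, 2],
     [5, 4, 1], [6, 0, 3], [7, 6, 2], [8, 2, 4]]
y = [1 + 2 * row[0] - 3 * row[2] for row in X]

best = best_subsets(X, y)
for size, cols in best.items():
    print(size, cols, round(rss_for(cols, X, y), 6))
```

Exhaustive search fits every one of the 2^p - 1 non-empty subsets, which is only practical for a handful of candidate variables; that cost is the reason pruning strategies such as leaps and bounds are of interest.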
Abstract:
Virtual metrology (VM) aims to predict metrology values using sensor data from production equipment and physical metrology values of preceding samples. VM is a promising technology for the semiconductor manufacturing industry as it can reduce the frequency of in-line metrology operations and provide supportive information for other operations such as fault detection, predictive maintenance and run-to-run control. The prediction models for VM can be from a large variety of linear and nonlinear regression methods and the selection of a proper regression method for a specific VM problem is not straightforward, especially when the candidate predictor set is of high dimension, correlated and noisy. Using process data from a benchmark semiconductor manufacturing process, this paper evaluates the performance of four typical regression methods for VM: multiple linear regression (MLR), least absolute shrinkage and selection operator (LASSO), neural networks (NN) and Gaussian process regression (GPR). It is observed that GPR performs the best among the four methods and that, remarkably, the performance of linear regression approaches that of GPR as the subset of selected input variables is increased. The observed competitiveness of high-dimensional linear regression models, which does not hold true in general, is explained in the context of extreme learning machines and functional link neural networks.
Abstract:
Statistical techniques are fundamental in science, and linear regression analysis is perhaps one of the most widely used methodologies. It is well known in the literature that, under certain conditions, linear regression is an extremely powerful statistical tool. Unfortunately, in practice, some of these conditions are rarely satisfied and the regression models become ill-posed, which prevents the application of traditional estimation methods. This work presents some contributions to maximum entropy theory in the estimation of ill-posed models, in particular the estimation of linear regression models with small samples affected by collinearity and outliers. The research is developed along three lines, namely the estimation of technical efficiency with state-contingent production frontiers, the estimation of the ridge parameter in ridge regression and, finally, new developments in maximum entropy estimation. In the estimation of technical efficiency with state-contingent production frontiers, the work shows that maximum entropy estimators perform better than the maximum likelihood estimator. This good performance is evident in models with few observations per state and in models with a large number of states, which are commonly affected by collinearity. The use of maximum entropy estimators is expected to contribute to the much-desired increase in empirical work with these production frontiers. In ridge regression the greatest challenge is the estimation of the ridge parameter. Although numerous procedures are available in the literature, none of them outperforms all the others. This work proposes a new estimator of the ridge parameter that combines the analysis of the ridge trace with maximum entropy estimation.
The results obtained in the simulation studies suggest that this new estimator is one of the best procedures available in the literature for estimating the ridge parameter. The Leuven maximum entropy estimator is based on the least squares method, Shannon entropy and concepts from quantum electrodynamics. This estimator overcomes the main criticism levelled at the generalized maximum entropy estimator, since it dispenses with supports for the parameters and errors of the regression model. This work presents new contributions to maximum entropy theory in the estimation of ill-posed models, based on the Leuven maximum entropy estimator, information theory and robust regression. The estimators developed perform well in linear regression models with small samples affected by collinearity and outliers. Finally, some computer code for maximum entropy estimation is presented, thereby adding to the scarce computational resources currently available.
Abstract:
Master's dissertation, Water and Coastal Management, Faculdade de Ciências e Tecnologia, Universidade do Algarve, 2010
Abstract:
In this paper, we present two Partial Least Squares Regression (PLSR) models for the compressive and flexural strength responses of a concrete composite material reinforced with pultrusion wastes. The main objective is to characterize this cost-effective waste management solution for glass fiber reinforced polymer (GFRP) pultrusion wastes and end-of-life products, which will lead to a more sustainable composite materials industry. The experiments took into account formulations with the incorporation of three different weight contents of GFRP waste materials into polyester-based mortars, as sand aggregate and filler replacements, two waste particle size grades and the incorporation of a silane adhesion promoter into the polyester resin matrix in order to improve binder-aggregate interfaces. The regression models were obtained for these data and two latent variables were identified as suitable, with a 95% confidence level. This technological option for improving the quality of GFRP-filled polymer mortars is viable, thus opening a door to selective recycling of GFRP waste and its use in the production of concrete-polymer based products. However, further and complementary studies will be necessary to confirm the technical and economic viability of the process.
Abstract:
The prediction of the time and the efficiency of the remediation of contaminated soils using soil vapor extraction remains a difficult challenge for the scientific community and consultants. This work reports the development of multiple linear regression and artificial neural network models to predict the remediation time and efficiency of soil vapor extractions performed in soils contaminated separately with benzene, toluene, ethylbenzene, xylene, trichloroethylene, and perchloroethylene. The results demonstrated that the artificial neural network approach performs better than the multiple linear regression models. The artificial neural network model allowed an accurate prediction of remediation time and efficiency based only on soil and pollutant characteristics, consequently allowing a simple and quick preliminary evaluation of the process viability.
Abstract:
This paper characterizes four ‘fractal vegetables’: (i) cauliflower (Brassica oleracea var. botrytis); (ii) broccoli (Brassica oleracea var. italica); (iii) round cabbage (Brassica oleracea var. capitata) and (iv) Brussels sprout (Brassica oleracea var. gemmifera), by means of electrical impedance spectroscopy and fractional calculus tools. Experimental data are approximated using fractional-order models and the corresponding parameters are determined with a genetic algorithm. The Havriliak-Negami five-parameter model fits the data well, demonstrating that classical formulae can constitute simple and reliable models to characterize biological structures.
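As an illustration of what a five-parameter Havriliak-Negami model looks like, the sketch below evaluates an impedance of the form Z(jω) = R_inf + (R_0 - R_inf) / (1 + (jωτ)^α)^β. This form is an assumption carried over from the classical dielectric relaxation expression; the abstract does not state the exact parameterization used in the paper, and the parameter values are hypothetical. The genetic-algorithm fitting step is not shown.

```python
import cmath
import math


def havriliak_negami(omega, r_inf, r_0, tau, alpha, beta):
    """Impedance of an assumed five-parameter Havriliak-Negami model.

    omega        angular frequency in rad/s
    r_inf, r_0   high- and low-frequency limits of the impedance
    tau          relaxation time in seconds
    alpha, beta  fractional exponents (alpha = beta = 1 gives a Debye model)
    """
    return r_inf + (r_0 - r_inf) / (1 + (1j * omega * tau) ** alpha) ** beta


# Hypothetical parameter values, chosen only to produce a plausible curve.
params = dict(r_inf=100.0, r_0=5000.0, tau=1e-4, alpha=0.8, beta=0.9)

for freq_hz in (1.0, 1e3, 1e6):
    z = havriliak_negami(2 * math.pi * freq_hz, **params)
    print(f"{freq_hz:>9.0f} Hz  |Z| = {abs(z):8.1f} ohm  "
          f"phase = {math.degrees(cmath.phase(z)):6.1f} deg")
```

A fitting routine would wrap this function in an error measure between measured and modelled impedance over the swept frequencies and search over the five parameters.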
Abstract:
Objective: To determine which socio-demographic, exposure, morbidity and symptom variables are associated with health-related quality of life among former and current heavy smokers. Methods: Cross-sectional data from 2537 participants were studied. All participants were at ≥2% risk of developing lung cancer within 6 years. Linear and logistic regression models utilizing a multivariable fractional polynomial selection process identified variables associated with health-related quality of life (HRQoL), measured by the EQ-5D. Results: Upstream and downstream associations between smoking cessation and higher health-related quality of life were evident. Significant upstream associations, such as education level and current working status, were explained by the addition of morbidities and symptoms to regression models. Having arthritis, decreased forced expiratory volume in one second, fatigue, poor appetite or dyspnea were most highly and commonly associated with decreased HRQoL. Discussion: Upstream factors such as educational attainment, employment status and smoking cessation should be targeted to prevent decreased health-related quality of life. Practitioners should focus treatment on downstream factors, especially symptoms, to improve health-related quality of life.
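The idea behind fractional polynomial selection can be sketched for a single continuous predictor: transform x with each power from a small conventional set (power 0 standing for log x), fit a straight line to each transformed version, and keep the power with the best fit. The sketch below is a simplified first-degree illustration using ordinary least squares and RSS; it is not the multivariable procedure used in the study, which also handles several covariates, second-degree terms and formal tests. The data and function names are hypothetical.

```python
import math

# Conventional fractional polynomial power set; 0 is read as log(x).
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_transform(x, p):
    """Apply a fractional polynomial power to a strictly positive value."""
    return math.log(x) if p == 0 else x ** p


def fit_line(u, y):
    """Simple OLS of y on one regressor u; returns (intercept, slope, rss)."""
    n = len(u)
    mean_u, mean_y = sum(u) / n, sum(y) / n
    sxx = sum((a - mean_u) ** 2 for a in u)
    sxy = sum((a - mean_u) * (b - mean_y) for a, b in zip(u, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_u
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(u, y))
    return intercept, slope, rss


def best_fp1(x, y):
    """Return the power with the smallest RSS and its fitted line."""
    fits = {p: fit_line([fp_transform(v, p) for v in x], y) for p in FP_POWERS}
    best = min(fits, key=lambda p: fits[p][2])
    return best, fits[best]


# Synthetic data (hypothetical): the outcome is exactly linear in log(x).
x = [0.5, 1, 2, 3, 4, 6, 8, 10]
y = [2 + 3 * math.log(v) for v in x]

power, (b0, b1, rss) = best_fp1(x, y)
print(f"selected power: {power}, intercept: {b0:.3f}, slope: {b1:.3f}")
```

For a logistic model the same search would compare deviances of fitted logistic regressions instead of RSS.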