937 results for multiple linear regression models
Abstract:
The majority of past and current individual-tree growth modelling methodologies have failed to characterise and incorporate structured stochastic components. Rather, they have relied on deterministic predictions or have added an unstructured random component to predictions. In particular, spatial stochastic structure has been neglected, despite being present in most applications of individual-tree growth models. Spatial stochastic structure (also called spatial dependence or spatial autocorrelation) eventuates when spatial influences such as competition and micro-site effects are not fully captured in models. Temporal stochastic structure (also called temporal dependence or temporal autocorrelation) eventuates when a sequence of measurements is taken on an individual-tree over time, and variables explaining temporal variation in these measurements are not included in the model. Nested stochastic structure eventuates when measurements are combined across sampling units and differences among the sampling units are not fully captured in the model. This review examines spatial, temporal, and nested stochastic structure and instances where each has been characterised in the forest biometry and statistical literature. Methodologies for incorporating stochastic structure in growth model estimation and prediction are described. Benefits from incorporation of stochastic structure include valid statistical inference, improved estimation efficiency, and more realistic and theoretically sound predictions. It is proposed in this review that individual-tree modelling methodologies need to characterise and include structured stochasticity. Possibilities for future research are discussed. (C) 2001 Elsevier Science B.V. All rights reserved.
Abstract:
The objective of this study was to estimate (co)variance functions using random regression models on Legendre polynomials for the analysis of repeated measures of BW from birth to adult age. A total of 82,064 records from 8,145 females were analyzed. Different models were compared. The models included additive direct and maternal effects, and animal and maternal permanent environmental effects as random terms. Contemporary group and dam age at calving (linear and quadratic effect) were included as fixed effects, and orthogonal Legendre polynomials of animal age (cubic regression) were considered as random co-variables. Eight models with polynomials of third to sixth order were used to describe additive direct and maternal effects, and animal and maternal permanent environmental effects. Residual effects were modeled using 1 (i.e., assuming homogeneity of variances across all ages) or 5 age classes. The model with 5 classes was the best to describe the trajectory of residuals along the growth curve. The model including fourth- and sixth-order polynomials for additive direct and animal permanent environmental effects, respectively, and third-order polynomials for maternal genetic and maternal permanent environmental effects was the best. Estimates of (co)variance obtained with the multi-trait and random regression models were similar. Direct heritability estimates obtained with the random regression models followed a trend similar to that obtained with the multi-trait model. The largest estimates of maternal heritability were those of BW taken close to 240 d of age. In general, estimates of correlation between BW from birth to 8 yr of age decreased with increasing distance between ages.
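As an illustration of the covariates this abstract describes, the following is a minimal sketch of how ages might be mapped to normalized Legendre polynomials and how a (co)variance function could then be evaluated between any two ages. The function names, the example ages, the identity coefficient matrix `K`, and the convention that "order" counts coefficients are all assumptions made for the example, not details taken from the study.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_covariates(age, age_min, age_max, order):
    """Return normalized Legendre covariates, one row per age.

    `order` is taken to be the number of coefficients (order 4 = cubic);
    this convention is an assumption of the sketch.
    """
    age = np.asarray(age, dtype=float)
    # Map ages onto [-1, 1], the interval on which Legendre polynomials are orthogonal.
    x = -1.0 + 2.0 * (age - age_min) / (age_max - age_min)
    # Columns are P_0(x), P_1(x), ..., P_{order-1}(x).
    basis = legendre.legvander(x, order - 1)
    # Scale column k by sqrt((2k + 1) / 2) so the polynomials are orthonormal.
    k = np.arange(order)
    return basis * np.sqrt((2 * k + 1) / 2.0)


def covariance_function(Z, K):
    """Covariance between all pairs of ages: G = Z K Z'."""
    return Z @ K @ Z.T


# Hypothetical ages in days, from birth to 8 years.
ages = np.array([0.0, 240.0, 550.0, 1460.0, 2920.0])
Z = legendre_covariates(ages, age_min=0.0, age_max=2920.0, order=4)

# Placeholder covariance matrix of the random regression coefficients.
K = np.eye(4)
G = covariance_function(Z, K)
```

In a real analysis `K` would be estimated (for example by REML) separately for each random effect; the sketch only shows how the estimated coefficients translate into covariances at arbitrary ages.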
Abstract:
Despite their limitations, linear filter models continue to be used to simulate the receptive field properties of cortical simple cells. For theoreticians interested in large scale models of visual cortex, a family of self-similar filters represents a convenient way in which to characterise simple cells in one basic model. This paper reviews research on the suitability of such models, and goes on to advance biologically motivated reasons for adopting a particular group of models in preference to all others. In particular, the paper describes why the Gabor model, so often used in network simulations, should be dropped in favour of a Cauchy model, both on the grounds of frequency response and mutual filter orthogonality.
Abstract:
This paper proposes a template for modelling complex datasets that integrates traditional statistical modelling approaches with more recent advances in statistics and modelling through an exploratory framework. Our approach builds on the well-known and long standing traditional idea of 'good practice in statistics' by establishing a comprehensive framework for modelling that focuses on exploration, prediction, interpretation and reliability assessment, a relatively new idea that allows individual assessment of predictions. The integrated framework we present comprises two stages. The first involves the use of exploratory methods to help visually understand the data and identify a parsimonious set of explanatory variables. The second encompasses a two step modelling process, where the use of non-parametric methods such as decision trees and generalized additive models are promoted to identify important variables and their modelling relationship with the response before a final predictive model is considered. We focus on fitting the predictive model using parametric, non-parametric and Bayesian approaches. This paper is motivated by a medical problem where interest focuses on developing a risk stratification system for morbidity of 1,710 cardiac patients given a suite of demographic, clinical and preoperative variables. Although the methods we use are applied specifically to this case study, these methods can be applied across any field, irrespective of the type of response.
Abstract:
Objective: optimism has been shown to be an important variable in the adjustment of quality of life in people with chronic diseases. This study aims to determine whether optimism exerts a moderating or mediating effect between personality traits and quality of life in Portuguese patients with chronic diseases. Methods: multiple linear regression models were used to assess the moderating and mediating effects of optimism on quality of life. The sample, comprising 729 patients recruited from the main hospitals in Portugal, completed self-report questionnaires assessing sociodemographic and clinical questions, personality, dispositional optimism, quality of life and subjective well-being. Results: the results showed that dispositional optimism does not play a moderating role between personality traits and quality of life. Controlling for age, sex, education level and perceived disease severity, the effect of personality traits on quality of life and subjective well-being was mediated by optimism (partially and fully), except for the associations between neuroticism/openness to experience and physical health. Conclusion: dispositional optimism plays only a mediating role between personality traits and quality of life in people with chronic diseases, suggesting that 'the expectation that good things will happen' contributes to better quality of life and better subjective well-being.
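The moderation and mediation tests mentioned in this abstract can be sketched with ordinary least squares. The example below is a minimal illustration on simulated data, assuming the classic regression-based approach (total effect, path a, path b, direct effect, and a product term for moderation). The variable names, effect sizes and the helper `ols` are hypothetical, and the covariates the study controlled for are omitted.

```python
import numpy as np


def ols(y, *predictors):
    """Least-squares coefficients for y on an intercept plus the predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Simulated data (not from the study): trait -> optimism -> quality of life.
rng = np.random.default_rng(0)
n = 729  # same size as the reported sample; the values themselves are synthetic
trait = rng.normal(size=n)
optimism = 0.5 * trait + rng.normal(size=n)
qol = 0.4 * optimism + 0.1 * trait + rng.normal(size=n)

# Mediation: total effect c, path a, then direct effect c' and path b.
c_total = ols(qol, trait)[1]
a = ols(optimism, trait)[1]
_, c_direct, b = ols(qol, trait, optimism)
indirect = a * b  # in OLS, c_total = c_direct + a * b holds exactly

# Moderation: coefficient of the trait x optimism product term.
interaction = ols(qol, trait, optimism, trait * optimism)[3]
```

Full mediation corresponds to a direct effect near zero once the mediator is included, partial mediation to a reduced but non-zero direct effect; in practice both the indirect effect and the interaction term would be tested for significance rather than read off as point estimates.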
Abstract:
OBJECTIVE: To examine the association between tooth loss and general and central obesity among adults. METHODS: Population-based cross-sectional study with 1,720 adults aged 20 to 59 years from Florianópolis, Southern Brazil. Home interviews were performed and anthropometric measures were taken. Information on sociodemographic data, self-reported diabetes, self-reported number of teeth, central obesity (waist circumference [WC] > 88 cm in women and > 102 cm in men) and general obesity (body mass index [BMI] ≥ 30 kg/m²) was collected. We used multivariable Poisson regression models to assess the association between general and central obesity and tooth loss after controlling for confounders. We also performed simple and multiple linear regressions by using BMI and WC as continuous variables. Interaction between age and tooth loss was also assessed. RESULTS: The mean BMI was 25.9 kg/m² (95%CI 25.6;26.2) in men and 25.4 kg/m² (95%CI 25.0;25.7) in women. The mean WC was 79.3 cm (95%CI 78.4;80.1) in men and 88.4 cm (95%CI 87.6;89.2) in women. A positive association was found between the presence of fewer than 10 teeth in at least one arch and increased mean BMI and WC after adjusting for education level, self-reported diabetes, gender and monthly per capita income. However, this association was lost when the variable age was included in the model. The prevalence of general obesity was 50% higher in those with fewer than 10 teeth in at least one arch when compared with those with 10 or more teeth in both arches after adjusting for education level, self-reported diabetes and monthly per capita family income. However, the statistical significance was lost after controlling for age. CONCLUSIONS: Obesity was associated with number of teeth, though it depended on the participants' age groups.
Abstract:
The GxE interaction only became widely discussed from evolutionary studies and evaluations of the causes of behavioral changes of species cultivated in environments. In the last 60 years, several methodologies for the study of adaptability and stability of genotypes in multiple environments trials were developed in order to assist the breeder's choice regarding which genotypes are more stable and which are the most suitable for the crops in the most diverse environments. The methods that use linear regression analysis were the first to be used in a general way by breeders, followed by multivariate analysis methods and mixed models. The need to identify the genetic and environmental causes that are behind the GxE interaction led to the development of new models that include the use of covariates and which can also include both multivariate methods and mixed modeling. However, further studies are needed to identify the causes of GxE interaction as well as for the more accurate measurement of its effects on phenotypic expression of varieties in competition trials carried out in genetic breeding programs.
Abstract:
PURPOSE. To evaluate potential risk factors for the development of multiple sclerosis in Brazilian patients. METHOD. A case control study was carried out in 81 patients enrolled at the Department of Neurology of the Hospital da Lagoa in Rio de Janeiro, and 81 paired controls. A standardized questionnaire on demographic, social and cultural variables, and medical and family history was used. Statistical analysis was performed using descriptive statistics and conditional logistic regression models with the SPSS for Windows software program. RESULTS. Having standard vaccinations (vaccinations specified by the Brazilian government) (OR=16.2; 95% CI=2.3-115.2), smoking (OR=7.6; 95% CI=2.1-28.2), being single (OR=4.7; 95% CI=1.4-15.6) and eating animal brain (OR=3.4; 95% CI=1.2-9.8) increased the risk of developing MS. CONCLUSIONS. The results of this study may contribute towards better awareness of the epidemiological characteristics of Brazilian patients with multiple sclerosis.
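To make the reported odds ratios easier to read, the sketch below shows how an odds ratio and its 95% confidence interval relate to a logistic regression coefficient, assuming the intervals are Wald intervals of the form exp(beta ± 1.96·SE). That assumption, and the helper name `log_scale_summary`, are not stated in the abstract; only the numbers in `reported` come from it.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_scale_summary(ci_low, ci_high):
    """Recover the log-odds coefficient and its standard error from a Wald CI."""
    beta = (math.log(ci_low) + math.log(ci_high)) / 2.0
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    return beta, se


# Odds ratios and 95% CIs as reported in the abstract above: (OR, lower, upper).
reported = {
    "standard vaccinations": (16.2, 2.3, 115.2),
    "smoking": (7.6, 2.1, 28.2),
    "being single": (4.7, 1.4, 15.6),
    "eating animal brain": (3.4, 1.2, 9.8),
}

implied = {}
for factor, (odds_ratio, low, high) in reported.items():
    beta, se = log_scale_summary(low, high)
    # exp(beta) should sit close to the reported OR if the interval is a Wald interval.
    implied[factor] = (math.exp(beta), se)
```

The wide intervals correspond to large standard errors on the log scale, which is expected with 81 matched pairs.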
Abstract:
Uncertainty quantification of petroleum reservoir models is one of the present challenges, which is usually approached with a wide range of geostatistical tools linked with statistical optimisation and/or inference algorithms. Recent advances in machine learning offer a novel approach to model spatial distribution of petrophysical properties in complex reservoirs alternative to geostatistics. The approach is based on semi-supervised learning, which handles both "labelled" observed data and "unlabelled" data, which have no measured value but describe prior knowledge and other relevant data in forms of manifolds in the input space where the modelled property is continuous. The proposed semi-supervised Support Vector Regression (SVR) model has demonstrated its capability to represent realistic geological features and describe stochastic variability and non-uniqueness of spatial properties. On the other hand, it is able to capture and preserve key spatial dependencies such as connectivity of high permeability geo-bodies, which is often difficult in contemporary petroleum reservoir studies. Semi-supervised SVR as a data driven algorithm is designed to integrate various kinds of conditioning information and learn dependences from it. The semi-supervised SVR model is able to balance signal/noise levels and control the prior belief in available data. In this work, a stochastic semi-supervised SVR geomodel is integrated into a Bayesian framework to quantify uncertainty of reservoir production with multiple models fitted to past dynamic observations (production history). Multiple history matched models are obtained using stochastic sampling and/or MCMC-based inference algorithms, which evaluate posterior probability distribution. Uncertainty of the model is described by posterior probability of the model parameters that represent key geological properties: spatial correlation size, continuity strength, smoothness/variability of spatial property distribution.
The developed approach is illustrated with a fluvial reservoir case. The resulting probabilistic production forecasts are described by uncertainty envelopes. The paper compares the performance of the models with different combinations of unknown parameters and discusses sensitivity issues.
Abstract:
Background: Multiple logistic regression is precluded from many practical applications in ecology that aim to predict the geographic distributions of species because it requires absence data, which are rarely available or are unreliable. In order to use multiple logistic regression, many studies have simulated "pseudo-absences" through a number of strategies, but it is unknown how the choice of strategy influences models and their geographic predictions of species. In this paper we evaluate the effect of several prevailing pseudo-absence strategies on the predictions of the geographic distribution of a virtual species whose "true" distribution and relationship to three environmental predictors was predefined. We evaluated the effect of using a) real absences, b) pseudo-absences selected randomly from the background, and c) two-step approaches: pseudo-absences selected from low suitability areas predicted by either Ecological Niche Factor Analysis (ENFA) or BIOCLIM. We compared how the choice of pseudo-absence strategy affected model fit, predictive power, and information-theoretic model selection results. Results: Models built with true absences had the best predictive power, best discriminatory power, and the "true" model (the one that contained the correct predictors) was supported by the data according to AIC, as expected. Models based on random pseudo-absences had among the lowest fit, but yielded the second highest AUC value (0.97), and the "true" model was also supported by the data. Models based on two-step approaches had intermediate fit, the lowest predictive power, and the "true" model was not supported by the data. Conclusion: If ecologists wish to build parsimonious GLM models that will allow them to make robust predictions, a reasonable approach is to use a large number of randomly selected pseudo-absences, and perform model selection based on an information theoretic approach. However, the resulting models can be expected to have limited fit.
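The random pseudo-absence strategy recommended in this abstract can be sketched as follows. This is a minimal illustration on a simulated virtual species, not a reproduction of the study: the grid size, the two predictors, the sample sizes, and the hand-written Newton-Raphson fit are all assumptions chosen to keep the example self-contained.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson; return coefficients and log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    loglik = np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return beta, loglik


def aic(loglik, n_params):
    """Akaike information criterion."""
    return 2.0 * n_params - 2.0 * loglik


rng = np.random.default_rng(42)
n_cells = 5000
env_true = rng.normal(size=n_cells)   # predictor that drives the virtual species
env_noise = rng.normal(size=n_cells)  # predictor unrelated to the species

# Hypothetical "true" suitability and the cells the virtual species occupies.
suitability = 1.0 / (1.0 + np.exp(-(2.0 * env_true - 1.0)))
occupied = np.flatnonzero(rng.random(n_cells) < suitability)

# Presence records, plus pseudo-absences drawn at random from the whole background
# (so a pseudo-absence may fall on an occupied cell, as in real applications).
presences = rng.choice(occupied, size=200, replace=False)
pseudo_absences = rng.choice(n_cells, size=1000, replace=False)
cells = np.concatenate([presences, pseudo_absences])
y = np.concatenate([np.ones(200), np.zeros(1000)])


def design(predictor):
    """Design matrix with an intercept and one predictor at the sampled cells."""
    return np.column_stack([np.ones(len(cells)), predictor[cells]])


beta_true, ll_true = fit_logistic(design(env_true), y)
beta_noise, ll_noise = fit_logistic(design(env_noise), y)
aic_true = aic(ll_true, 2)
aic_noise = aic(ll_noise, 2)
```

Comparing `aic_true` with `aic_noise` mirrors the information-theoretic model selection step: the candidate model containing the correct predictor should receive the lower AIC even though pseudo-absences contaminate the zeros.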
Abstract:
X-ray is a technology that is used for numerous applications in the medical field. The process of X-ray projection gives a 2-dimensional (2D) grey-level texture from a 3-dimensional (3D) object. Until now no clear demonstration or correlation has positioned the 2D texture analysis as a valid indirect evaluation of the 3D microarchitecture. TBS is a new texture parameter based on the measure of the experimental variogram. TBS evaluates the variation between 2D image grey-levels. The aim of this study was to evaluate existing correlations between 3D bone microarchitecture parameters - evaluated from μCT reconstructions - and the TBS value, calculated on 2D projected images. 30 dried human cadaveric vertebrae were acquired on a micro-scanner (eXplorer Locus, GE) at isotropic resolution of 93 μm. 3D vertebral body models were used. The following 3D microarchitecture parameters were used: bone volume fraction (BV/TV), trabecular thickness (TbTh), trabecular space (TbSp), trabecular number (TbN) and connectivity density (ConnD). 3D/2D projections were performed by taking into account the Beer-Lambert law at X-ray energies of 50, 100 and 150 keV. TBS was assessed on 2D projected images. Correlations between TBS and the 3D microarchitecture parameters were evaluated using a linear regression analysis. A paired t-test was used to assess the X-ray energy effects on TBS. Multiple linear regressions (backward) were used to evaluate relationships between TBS and 3D microarchitecture parameters using a bootstrap process. BV/TV of the sample ranged from 18.5 to 37.6% with an average value of 28.8%. Correlation analysis showed that TBS was strongly correlated with ConnD (0.856≤r≤0.862; p<0.001) and with TbN (0.805≤r≤0.810; p<0.001), and negatively with TbSp (−0.714≤r≤−0.726; p<0.001), regardless of X-ray energy. Results show that lower TBS values are related to "degraded" microarchitecture, with low ConnD, low TbN and a high TbSp. The opposite is also true.
X-ray energy has no effect on TBS nor on the correlations between TBS and the 3D microarchitecture parameters. In this study, we demonstrated that TBS was significantly correlated with 3D microarchitecture parameters ConnD and TbN, and negatively with TbSp, no matter what X-ray energy was used. This article is part of a Special Issue entitled ECTS 2011. Disclosure of interest: None declared.
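The 3D-to-2D projection step described in this abstract can be illustrated with the Beer-Lambert law, I = I0·exp(−μ·d), where d is the bone path length along each ray. The sketch below is a simplified assumption-laden version: it uses a random binary volume in place of a μCT reconstruction, parallel rays along one axis, and placeholder attenuation coefficients that are not tabulated values for bone.

```python
import numpy as np


def project_beer_lambert(volume, mu_per_cm, voxel_cm, axis=0, i0=1.0):
    """Project a binary bone volume to a 2D transmitted-intensity image.

    Beer-Lambert law: I = I0 * exp(-mu * path length through bone).
    """
    bone_path_cm = volume.sum(axis=axis) * voxel_cm
    return i0 * np.exp(-mu_per_cm * bone_path_cm)


# Hypothetical random binary volume standing in for a muCT reconstruction.
rng = np.random.default_rng(1)
volume = (rng.random((20, 32, 32)) < 0.29).astype(float)
bv_tv = volume.mean()  # bone volume fraction (BV/TV)

voxel_cm = 93e-4  # 93 micrometres, the isotropic resolution quoted above

# Placeholder attenuation coefficients (1/cm) per X-ray energy in keV.
mu_by_energy = {50: 0.6, 100: 0.3, 150: 0.25}
images = {
    energy: project_beer_lambert(volume, mu, voxel_cm)
    for energy, mu in mu_by_energy.items()
}
image = images[50]
```

A texture parameter such as TBS would then be computed on each 2D image; because changing μ only rescales the exponent, the spatial pattern of the projection is preserved across energies.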
Abstract:
In CoDaWork’05, we presented an application of discriminant function analysis (DFA) to 4 different compositional datasets and modelled the first canonical variable using a segmented regression model solely based on an observation about the scatter plots. In this paper, multiple linear regressions are applied to different datasets to confirm the validity of our proposed model. In addition to dating the unknown tephras by calibration as discussed previously, another method of mapping the unknown tephras into samples of the reference set or missing samples in between consecutive reference samples is proposed. The application of these methodologies is demonstrated with both simulated and real datasets. This new proposed methodology provides an alternative, more acceptable approach for geologists as their focus is on mapping the unknown tephra with relevant eruptive events rather than estimating the age of unknown tephra. Key words: Tephrochronology; Segmented regression
Abstract:
Random coefficient regression models have been applied in different fields and they constitute a unifying setup for many statistical problems. The nonparametric study of this model started with Beran and Hall (1992) and it has become a fruitful framework. In this paper we propose and study statistics for testing a basic hypothesis concerning this model: the constancy of coefficients. The asymptotic behavior of the statistics is investigated and bootstrap approximations are used in order to determine the critical values of the test statistics. A simulation study illustrates the performance of the proposals.
Abstract:
Epstein-Barr virus (EBV) has been associated with multiple sclerosis (MS). However, most studies examining the relationship between the virus and the disease have been based on serologies. If EBV is linked to MS, CD8+ T cells are likely to be involved, as they are important both in MS pathogenesis and in controlling viruses. We hypothesized that valuable information on the link between MS and EBV would be ascertained from the study of frequency and activation levels of EBV-specific CD8+ T cells in different categories of MS patients and control subjects. We investigated EBV-specific cellular immune responses using proliferation and enzyme linked immunospot assays, and humoral immune responses by analysis of anti-EBV antibodies, in a cohort of 164 subjects, including 108 patients with different stages of MS, 35 with other neurological diseases and 21 healthy control subjects. Additionally, the cohort were all tested against cytomegalovirus (CMV), another neurotropic herpes virus not convincingly associated with MS, nor thought to be deleterious to the disease. We corrected all data for age using linear regression analysis over the total cohorts of EBV- and CMV-infected subjects. In the whole cohort, the rates of EBV and CMV infection were 99% and 51%, respectively. The frequency of IFN-gamma secreting EBV-specific CD8+ T cells in patients with clinically isolated syndrome (CIS) was significantly higher than that found in patients with relapsing-remitting MS (RR-MS), secondary-progressive MS, primary-progressive MS, patients with other neurological diseases and healthy controls. The shorter the interval between MS onset and our assays, the more intense was the EBV-specific CD8+ T-cell response. Confirming the above results, we found that EBV-specific CD8+ T-cell responses decreased in 12/13 patients with CIS followed prospectively for 1.0 +/- 0.2 years.
In contrast, there was no difference between categories for EBV-specific CD4+ T cell, or for CMV-specific CD4+ and CD8+ T-cell responses. Anti-EBV-encoded nuclear antigen-1 (EBNA-1)-specific antibodies correlated with EBV-specific CD8+ T cells in patients with CIS and RR-MS. However, whereas EBV-specific CD8+ T cells were increased the most in early MS, EBNA-1-specific antibodies were increased in early as well as in progressive forms of MS. Our data show high levels of CD8+ T-cell activation against EBV, but not CMV, early in the course of MS, which supports the hypothesis that EBV might be associated with the onset of this disease.
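The age correction by linear regression mentioned in this abstract can be sketched as a residualization step: fit a straight line of the response on age over the whole cohort, then remove the fitted age trend while keeping the cohort mean. The function name `age_adjust` and the simulated values are hypothetical, and the authors' exact procedure may differ.

```python
import numpy as np


def age_adjust(values, age):
    """Remove the linear age trend from values, keeping the overall mean."""
    values = np.asarray(values, dtype=float)
    age = np.asarray(age, dtype=float)
    # np.polyfit returns coefficients from highest degree down: slope, then intercept.
    slope, intercept = np.polyfit(age, values, deg=1)
    return values - slope * (age - age.mean())


# Simulated cohort (not study data): a response that declines with age.
rng = np.random.default_rng(7)
age = rng.uniform(20.0, 60.0, size=164)
response = 50.0 - 0.4 * age + rng.normal(scale=5.0, size=164)

adjusted = age_adjust(response, age)
residual_slope = np.polyfit(age, adjusted, deg=1)[0]
```

After adjustment the response no longer trends with age, so differences between patient categories are not confounded by their age distributions.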
Abstract:
Radioactive soil-contamination mapping and risk assessment is a vital issue for decision makers. Traditional approaches for mapping the spatial concentration of radionuclides employ various regression-based models, which usually provide a single-value prediction realization accompanied (in some cases) by estimation error. Such approaches do not provide the capability for rigorous uncertainty quantification or probabilistic mapping. Machine learning is a recent and fast-developing approach based on learning patterns and information from data. Artificial neural networks for prediction mapping have been especially powerful in combination with spatial statistics. A data-driven approach provides the opportunity to integrate additional relevant information about spatial phenomena into a prediction model for more accurate spatial estimates and associated uncertainty. Machine-learning algorithms can also be used for a wider spectrum of problems than before: classification, probability density estimation, and so forth. Stochastic simulations are used to model spatial variability and uncertainty. Unlike regression models, they provide multiple realizations of a particular spatial pattern that allow uncertainty and risk quantification. This paper reviews the most recent methods of spatial data analysis, prediction, and risk mapping, based on machine learning and stochastic simulations in comparison with more traditional regression models. The radioactive fallout from the Chernobyl Nuclear Power Plant accident is used to illustrate the application of the models for prediction and classification problems. This fallout is a unique case study that provides the challenging task of analyzing huge amounts of data ('hard' direct measurements, as well as supplementary information and expert estimates) and solving particular decision-oriented problems.