Biblioteca Digital

952 results for Linear models (Statistics)

Validation procedures in radiological diagnostic models. Neural network and logistic regression

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The objective of this paper is to compare the performance of two predictive radiological models, logistic regression (LR) and neural network (NN), with five different resampling methods. One hundred and sixty-seven patients with proven calvarial lesions as the only known disease were enrolled. Clinical and CT data were used for LR and NN models. Both models were developed with cross validation, leave-one-out and three different bootstrap algorithms. The final results of each model were compared using the error rate and the area under receiver operating characteristic curves (Az). The neural network obtained statistically higher Az than LR with cross validation. The remaining resampling validation methods did not reveal statistically significant differences between LR and NN rules. The neural network classifier performs better than the one based on logistic regression. This advantage is well detected by three-fold cross-validation, but remains unnoticed when leave-one-out or bootstrap algorithms are used.
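The two non-bootstrap resampling schemes named in this abstract differ only in how the patients are partitioned into training and test sets. The following is a minimal sketch of that partitioning, not the authors' procedure: the function names are hypothetical, folds are taken as contiguous index blocks, and in practice the indices would usually be shuffled first.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Hypothetical helper: fold sizes differ by at most one sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def leave_one_out_splits(n_samples):
    """Leave-one-out is k-fold cross-validation with k equal to n_samples."""
    return k_fold_splits(n_samples, n_samples)
```

With the 167 patients of the study, three-fold cross-validation tests each model on roughly 56 patients per fold, whereas leave-one-out fits 167 models that are each tested on a single patient.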

Statistical learning theory for geospatial data. Case study: Aral sea

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In recent years there has been an explosive growth in the development of adaptive and data-driven methods. One efficient data-driven approach is based on statistical learning theory (SLT) (Vapnik 1998). The theory is based on the Structural Risk Minimisation (SRM) principle and has a solid statistical background. When applying SRM we are trying not only to reduce the training error (that is, to fit the available data with a model), but also to reduce the complexity of the model and the generalisation error. Many nonlinear learning procedures recently developed in neural networks and statistics can be understood and interpreted in terms of the structural risk minimisation inductive principle. A recent methodology based on SRM is called Support Vector Machines (SVM). At present SLT is still under intensive development and SVM find new areas of application (www.kernel-machines.org). SVM develop robust, nonlinear data models with excellent generalisation abilities, which is very important both for monitoring and forecasting. SVM are extremely good when the input space is high dimensional and the training data set is not big enough to develop a corresponding nonlinear model. Moreover, SVM use only support vectors to derive decision boundaries. This opens a way to sampling optimization, estimation of noise in data, quantification of data redundancy, etc. A presentation of SVM for spatially distributed data is given in (Kanevski and Maignan 2004).
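The trade-off between training error and model complexity described here has a common concrete form in the soft-margin linear SVM objective. The sketch below evaluates that objective for a given weight vector. It is an illustration under stated assumptions rather than the method of the paper: the function names and the trade-off parameter `c` are hypothetical, and labels are assumed to be coded as +1 or -1.

```python
def hinge_loss(w, b, x, y):
    """Hinge loss of one sample; y is assumed to be +1 or -1."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return max(0.0, 1.0 - margin)


def svm_objective(w, b, data, c=1.0):
    """Soft-margin objective: complexity term plus weighted training error.

    data is a list of (feature_vector, label) pairs; c is a hypothetical
    trade-off parameter (larger c penalises training errors more).
    """
    complexity = 0.5 * sum(wi * wi for wi in w)  # small norm means wide margin
    empirical = sum(hinge_loss(w, b, x, y) for x, y in data)
    return complexity + c * empirical
```

Only samples with a margin below one contribute to the second term, which is why the fitted decision boundary depends on the support vectors alone.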

Zero inflated GLMM applied to barn owl data

Relevance:

30.00% 30.00%

Publisher:

Statistical tools for the study of risk profiles (Herramientas estadísticas para el estudio de perfiles de riesgo)

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This document illustrates, in a practical way, the use of three tools that allow the actuary to define tariff groups and estimate risk premiums in the class-rating process for non-life insurance. The first is segmentation analysis (CHAID and XAID), first used in 1997 by UNESPA on its common motor portfolio. The second is a stepwise selection process with the distance-based regression model. The third is a process with the well-known generalized linear model, which represents the most modern technique in the actuarial literature. With the latter, by combining different link functions and error distributions, we can obtain the classical additive and multiplicative models.

Local and global error models to improve uncertainty quantification

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In groundwater applications, Monte Carlo methods are employed to model the uncertainty on geological parameters. However, their brute-force application becomes computationally prohibitive for highly detailed geological descriptions, complex physical processes, and a large number of realizations. The Distance Kernel Method (DKM) overcomes this issue by clustering the realizations in a multidimensional space based on the flow responses obtained by means of an approximate (computationally cheaper) model; then, the uncertainty is estimated from the exact responses that are computed only for one representative realization per cluster (the medoid). Usually, DKM is employed to decrease the size of the sample of realizations that are considered to estimate the uncertainty. We propose to use the information from the approximate responses for uncertainty quantification. The subset of exact solutions provided by DKM is then employed to construct an error model and correct the potential bias of the approximate model. Two error models are devised that both employ the difference between approximate and exact medoid solutions, but differ in the way medoid errors are interpolated to correct the whole set of realizations. The Local Error Model rests upon the clustering defined by DKM and can be seen as a natural way to account for intra-cluster variability; the Global Error Model employs a linear interpolation of all medoid errors regardless of the cluster to which the single realization belongs. These error models are evaluated for an idealized pollution problem in which the uncertainty of the breakthrough curve needs to be estimated. For this numerical test case, we demonstrate that the error models improve the uncertainty quantification provided by the DKM algorithm and are effective in correcting the bias of the estimate computed solely from the MsFV results. 
The framework presented here is not specific to the methods considered and can be applied to other combinations of approximate models and techniques to select a subset of realizations.
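The difference between the two error models can be illustrated with a deliberately simplified sketch. It assumes that each realization has a single scalar approximate response, that the medoid error is the exact response minus the approximate response at the medoid, and that the global model interpolates medoid errors piecewise linearly along the approximate response. None of these choices is stated in the abstract, and all function names are hypothetical.

```python
def local_correction(approx, cluster_of, medoid_error):
    """Local sketch: shift each realization by the error of its own medoid.

    approx: approximate responses, one per realization
    cluster_of: cluster label of each realization
    medoid_error: dict mapping cluster label to (exact - approximate)
    """
    return [a + medoid_error[c] for a, c in zip(approx, cluster_of)]


def global_correction(approx, medoid_points):
    """Global sketch: interpolate all medoid errors, ignoring cluster labels.

    medoid_points: (approximate medoid response, medoid error) pairs.
    Errors are interpolated linearly and held constant outside the range.
    """
    pts = sorted(medoid_points)

    def interpolated_error(a):
        if a <= pts[0][0]:
            return pts[0][1]
        if a >= pts[-1][0]:
            return pts[-1][1]
        for (x0, e0), (x1, e1) in zip(pts, pts[1:]):
            if x0 <= a <= x1:
                t = (a - x0) / (x1 - x0)
                return e0 + t * (e1 - e0)

    return [a + interpolated_error(a) for a in approx]
```

In this sketch the local model applies a constant shift within each cluster, while the global model lets the correction vary smoothly between medoids.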

Impact of incorrect assumptions on the covariance structure of random effects and/or residuals in nonlinear mixed models for repeated measures data

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In this paper we analyse, using Monte Carlo simulation, the possible consequences of incorrect assumptions about the true structure of the random effects covariance matrix and the true correlation pattern of residuals on the performance of an estimation method for nonlinear mixed models. The procedure under study is the well-known linearization method due to Lindstrom and Bates (1990), implemented in the nlme library of S-Plus and R. Its performance is studied in terms of bias, mean square error (MSE), and true coverage of the associated asymptotic confidence intervals. Ignoring other criteria, such as the convenience of avoiding over-parameterised models, it seems worse to erroneously assume some structure than to assume no structure when one would be adequate.

Selecting statistical models to study the relationship between soybean yield and soil physical properties

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Statistical models allow the representation of data sets and the estimation and/or prediction of the behavior of a given variable through its interaction with the other variables involved in a phenomenon. Among the many available statistical models are the autoregressive state-space models (ARSS) and the linear regression models (LR), which allow the quantification of the relationships among soil-plant-atmosphere system variables. To compare the quality of the ARSS and LR models for the modeling of the relationships between soybean yield and soil physical properties, Akaike's Information Criterion, which provides a coefficient for the selection of the best model, was used in this study. The data sets were sampled in a Rhodic Acrudox soil, along a spatial transect with 84 points spaced 3 m apart. At each sampling point, soybean samples were collected for yield quantification. At the same site, soil penetration resistance was also measured and soil samples were collected to measure soil bulk density in the 0-0.10 m and 0.10-0.20 m layers. Results showed autocorrelation and a cross correlation structure of soybean yield and soil penetration resistance data. Soil bulk density data, however, were only autocorrelated in the 0-0.10 m layer and not cross correlated with soybean yield. According to Akaike's Information Criterion, the autoregressive state-space models were more efficient than the equivalent simple and multiple linear regression models. Their criterion values were lower than those obtained by the regression models for all combinations of explanatory variables.
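Akaike's Information Criterion is defined as AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L the maximised likelihood; the model with the lower value is preferred. The following minimal sketch computes it for a least-squares fit under the assumption of independent Gaussian errors. The abstract does not say which likelihood the authors used, and the function names are hypothetical.

```python
import math


def gaussian_log_likelihood(rss, n):
    """Maximised log-likelihood of a least-squares fit with Gaussian errors.

    rss: residual sum of squares; n: number of observations.
    """
    sigma2 = rss / n  # maximum-likelihood estimate of the error variance
    return -0.5 * n * (math.log(2.0 * math.pi) + math.log(sigma2) + 1.0)


def aic(log_likelihood, n_params):
    """AIC = 2k - 2 ln(L); n_params should count every estimated parameter."""
    return 2.0 * n_params - 2.0 * log_likelihood
```

Because of the 2k penalty, a model with more parameters is preferred only if it raises the log-likelihood by more than one unit per added parameter.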

Systematic trends in self-consistent calculations of linear quantum wires

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Systematic trends in the properties of a linear split-gate heterojunction are studied by solving iteratively the Poisson and Schrödinger equations for different gate potentials and temperatures. A two-dimensional approximation is presented that is much simpler in the numerical implementation and that accurately reproduces all significant trends. In deriving this approximation, we provide a rigorous and quantitative basis for the formulation of models that assumes a two-dimensional character for the electron gas at the junction.

Do habitat suitability models reliably predict the recovery areas of threatened species?

Relevance:

30.00% 30.00%

Publisher:

Abstract:

1. Identifying those areas suitable for recolonization by threatened species is essential to support efficient conservation policies. Habitat suitability models (HSM) predict species' potential distributions, but the quality of their predictions should be carefully assessed when the species-environment equilibrium assumption is violated.
2. We studied the Eurasian otter Lutra lutra, whose numbers are recovering in southern Italy. To produce widely applicable results, we chose standard HSM procedures and looked for the models' capacities in predicting the suitability of a recolonization area. We used two fieldwork datasets: presence-only data, used in the Ecological Niche Factor Analyses (ENFA), and presence-absence data, used in a Generalized Linear Model (GLM). In addition to cross-validation, we independently evaluated the models with data from a recolonization event, providing presences on a previously unoccupied river.
3. Three of the models successfully predicted the suitability of the recolonization area, but the GLM built with data before the recolonization disagreed with these predictions, missing the recolonized river's suitability and badly describing the otter's niche. Our results highlighted three points of relevance to modelling practices: (1) absences may prevent the models from correctly identifying areas suitable for a species spread; (2) the selection of variables may lead to randomness in the predictions; and (3) the Area Under Curve (AUC), a commonly used validation index, was not well suited to the evaluation of model quality, whereas the Boyce Index (CBI), based on presence data only, better highlighted the models' fit to the recolonization observations.
4. For species with unstable spatial distributions, presence-only models may work better than presence-absence methods in making reliable predictions of suitable areas for expansion. An iterative modelling process, using new occurrences from each step of the species spread, may also help in progressively reducing errors.
5. Synthesis and applications. Conservation plans depend on reliable models of the species' suitable habitats. In non-equilibrium situations, such as the case for threatened or invasive species, models could be affected negatively by the inclusion of absence data when predicting the areas of potential expansion. Presence-only methods will here provide a better basis for productive conservation management practices.

Effects of age, birth cohort and period of death on Swiss cancer mortality, 1951-1984.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Swiss death certification data over the period 1951-1984 for total cancer mortality and 30 major cancer sites in the population aged 25 to 74 years were analysed using a log-linear Poisson model with arbitrary constraints on the parameters to isolate the effects of birth cohort, calendar period of death and age. The overall pattern of total cancer mortality in males was stable for period values and showed some moderate decreases in cohort values restricted to the generations born after 1930. Cancer mortality trends were more favourable in females, with steady, though moderate, declines in both cohort and period values. According to the estimates from the model, the worst affected generation for male lung cancer was that born around 1910, and a flattening of trends or some moderate decline was observed for more recent cohorts, although this decline was considerably more limited than in other European countries. There were decreases in cohort and period values for stomach, intestine and oesophageal cancer in both sexes and (cervix) uteri in females. Increases were observed in both cohort and period trends for pancreas and liver in males and for several other neoplasms, including prostate, brain, leukaemias and lymphomas, restricted, however, for the latter sites, to the earlier cohorts and hence partly attributable to improved diagnosis and certification in the elderly. Although age values for lung cancer in females were around 10 times lower than in males, upward trends in female lung cancer cohort values were observed in subsequent cohorts and for period values from the late 1960s onwards. Therefore, future trends in female lung cancer mortality should continue to be monitored. The application of these age/period/cohort models thus provides a summary guide for the reading and interpretation of cancer mortality trends, although it cannot replace careful inspection of single age-specific rates.

Origin of the neutron skin thickness of 208Pb in nuclear mean-field models

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We study whether the neutron skin thickness Δrnp of 208Pb originates from the bulk or from the surface of the nucleon density distributions, according to the mean-field models of nuclear structure, and find that it depends on the stiffness of the nuclear symmetry energy. The bulk contribution to Δrnp arises from an extended sharp radius of neutrons, whereas the surface contribution arises from different widths of the neutron and proton surfaces. Nuclear models where the symmetry energy is stiff, as typical of relativistic models, predict a bulk contribution in Δrnp of 208Pb about twice as large as the surface contribution. In contrast, models with a soft symmetry energy like common nonrelativistic models predict that Δrnp of 208Pb is divided similarly into bulk and surface parts. Indeed, if the symmetry energy is supersoft, the surface contribution becomes dominant. We note that the linear correlation of Δrnp of 208Pb with the density derivative of the nuclear symmetry energy arises from the bulk part of Δrnp. We also note that most models predict a mixed-type (between halo and skin) neutron distribution for 208Pb. Although the halo-type limit is actually found in the models with a supersoft symmetry energy, the skin-type limit is not supported by any mean-field model. Finally, we compute parity-violating electron scattering in the conditions of the 208Pb parity radius experiment (PREX) and obtain a pocket formula for the parity-violating asymmetry in terms of the parameters that characterize the shape of the 208Pb nucleon densities.

Building predictive models of soil particle-size distribution

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Is it possible to build predictive models (PMs) of soil particle-size distribution (psd) in a region with complex geology and a young and unstable land-surface? The main objective of this study was to answer this question. A set of 339 soil samples from a small slope catchment in Southern Brazil was used to build PMs of psd in the surface soil layer. Multiple linear regression models were constructed using terrain attributes (elevation, slope, catchment area, convergence index, and topographic wetness index). The PMs explained more than half of the data variance. This performance is similar to (or even better than) that of the conventional soil mapping approach. For some size fractions, the PM performance can reach 70 %. Largest uncertainties were observed in geologically more complex areas. Therefore, significant improvements in the predictions can only be achieved if accurate geological data is made available. Meanwhile, PMs built on terrain attributes are efficient in predicting the particle-size distribution (psd) of soils in regions of complex geology.

Evaluation of statistical and geostatistical models of digital soil properties mapping in tropical mountain regions

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Soil properties have an enormous impact on economic and environmental aspects of agricultural production. Quantitative relationships between soil properties and the factors that influence their variability are the basis of digital soil mapping. The predictive models of soil properties evaluated in this work are statistical (multiple linear regression, MLR) and geostatistical (ordinary kriging and co-kriging). The study was conducted in the municipality of Bom Jardim, RJ, using a soil database with 208 sampling points. Predictive models were evaluated for sand, silt and clay fractions, pH in water and organic carbon at six depths according to the specifications of the consortium of digital soil mapping at the global level (GlobalSoilMap). Continuous covariates and categorical predictors were used and their contributions to the model assessed. Only the environmental covariates elevation, aspect, stream power index (SPI), soil wetness index (SWI), normalized difference vegetation index (NDVI), and b3/b2 band ratio were significantly correlated with soil properties. The predictive models had a mean coefficient of determination of 0.21. The best results were obtained with the geostatistical predictive models, where the highest coefficient of determination (0.43) was associated with sand properties at depths between 60 and 100 cm. The use of a sparse data set of soil properties for digital mapping can explain only part of the spatial variation of these properties. The results may be related to the sampling density and the quantity and quality of the environmental covariates and predictive models used.

Statistics of depth probed by cw measurement of photons in a turbid medium

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Photon migration in a turbid medium has been modeled in many different ways. The motivation for such modeling is based on technology that can be used to probe potentially diagnostic optical properties of biological tissue. Surprisingly, one of the more effective models is also one of the simplest. It is based on statistical properties of a nearest-neighbor lattice random walk. Here we develop a theory allowing one to calculate the number of visits by a photon to a given depth, if it is eventually detected at an absorbing surface. This mimics cw measurements made on biological tissue and is directed towards characterizing the depth reached by photons injected at the surface. Our development of the theory uses formalism based on the theory of a continuous-time random walk (CTRW). Formally exact results are given in the Fourier-Laplace domain, which, in turn, are used to generate approximations for parameters of physical interest.

Exact temporal evolution for some non-linear diffusion process

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Exact solutions to Fokker-Planck equations with nonlinear drift are considered. Applications of these exact solutions for concrete models are studied. We arrive at the conclusion that for certain drifts we obtain divergent moments (and infinite relaxation time) if the diffusion process can be extended without any obstacle to the whole space. But if we introduce a potential barrier that limits the diffusion process, moments converge with a finite relaxation time.
