Digital Library

963 results for STATISTICAL MODELS

Estimation and diagnostics for heteroscedastic nonlinear regression models based on scale mixtures of skew-normal distributions

Relevance: 30.00%

Abstract:

An extension of some standard likelihood-based procedures to heteroscedastic nonlinear regression models under scale mixtures of skew-normal (SMSN) distributions is developed. This novel class of models provides a useful generalization of the heteroscedastic symmetrical nonlinear regression models (Cysneiros et al., 2010), since the random term distributions cover symmetric as well as asymmetric and heavy-tailed distributions such as the skew-t, skew-slash and skew-contaminated normal, among others. A simple EM-type algorithm for iteratively computing maximum likelihood estimates of the parameters is presented and the observed information matrix is derived analytically. To examine the performance of the proposed methods, some simulation studies are presented to show the robustness of this flexible class against outlying and influential observations and that the maximum likelihood estimates based on the EM-type algorithm have good asymptotic properties. Furthermore, local influence measures and the one-step approximations of the estimates in the case-deletion model are obtained. Finally, an illustration of the methodology is given considering a data set previously analyzed under the homoscedastic skew-t nonlinear regression model. (C) 2012 Elsevier B.V. All rights reserved.

Predictive models for mutations in mismatch repair genes: implication for genetic counseling in developing countries

Relevance: 30.00%

Abstract:

Background: Lynch syndrome (LS) is the most common form of inherited predisposition to colorectal cancer (CRC), accounting for 2-5% of all CRC. LS is an autosomal dominant disease characterized by mutations in the mismatch repair genes mutL homolog 1 (MLH1), mutS homolog 2 (MSH2), postmeiotic segregation increased 1 (PMS1), postmeiotic segregation increased 2 (PMS2) and mutS homolog 6 (MSH6). Mutation risk prediction models can be incorporated into clinical practice, facilitating the decision-making process and identifying individuals for molecular investigation. This is extremely important in countries with limited economic resources. This study aims to evaluate the sensitivity and specificity of five predictive models for germline mutations in repair genes in a sample of individuals with suspected Lynch syndrome. Methods: Blood samples from 88 patients were analyzed by sequencing the MLH1, MSH2 and MSH6 genes. The probability of detecting a mutation was calculated using the PREMM, Barnetson, MMRpro, Wijnen and Myriad models. To evaluate the sensitivity and specificity of the models, receiver operating characteristic curves were constructed. Results: Among the 88 patients included in this analysis, 31 mutations were identified: 16 in the MSH2 gene and 15 in the MLH1 gene; no pathogenic mutations were identified in the MSH6 gene. The AUCs for the PREMM (0.846), Barnetson (0.850), MMRpro (0.821) and Wijnen (0.807) models did not differ significantly. The Myriad model presented a lower AUC (0.704) than the four other models evaluated. At a threshold of >= 5%, the models' sensitivity ranged from 0.87 (Wijnen) to 1 (Myriad) and specificity ranged from 0 (Myriad) to 0.38 (Barnetson). Conclusions: The Barnetson, PREMM, MMRpro and Wijnen models present similar AUCs. The AUC of the Myriad model is statistically inferior to those of the four other models.
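The evaluation described in this abstract (sensitivity and specificity at a probability threshold, plus the area under the ROC curve) can be sketched as follows. This is a minimal illustration only: the function names and the scores and labels are hypothetical and are not data from the study.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Treat score >= threshold as a predicted carrier; return (sensitivity, specificity)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted mutation probabilities and true carrier status (1 = mutation found).
scores = [0.02, 0.08, 0.30, 0.60, 0.04, 0.06, 0.07, 0.90]
labels = [0, 0, 1, 1, 0, 1, 0, 1]

sens, spec = sensitivity_specificity(scores, labels, threshold=0.05)
print(sens, spec, auc(scores, labels))
```

With a low threshold such as 5%, nearly every carrier is flagged, so sensitivity is high while specificity drops, which is the pattern the abstract reports.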

Anderson-like Transition for a Class of Random Sparse Models in d >= 2 Dimensions

Relevance: 30.00%

Abstract:

We show that the Kronecker sum of d >= 2 copies of a random one-dimensional sparse model displays a spectral transition of the type predicted by Anderson, from absolutely continuous around the center of the band to pure point around the boundaries. Possible applications to physics and open problems are discussed briefly.
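For readers unfamiliar with the construction, the Kronecker sum of two matrices A and B is A (x) I + I (x) B, and its eigenvalues are sums of eigenvalues of A and B. The sketch below checks this on a small deterministic 2x2 matrix; the matrix and helper names are illustrative assumptions and have nothing to do with the random sparse operators studied in the paper.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as lists of lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def kron_sum(a, b):
    """Kronecker sum: a (x) I + I (x) b."""
    left = kron(a, identity(len(b)))
    right = kron(identity(len(a)), b)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(left, right)]


def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


# Toy matrix: eigenvalue 3 with eigenvector (1, 1), eigenvalue 1 with eigenvector (1, -1).
A = [[2, 1], [1, 2]]
u, v = [1, 1], [1, -1]
uv = [x * y for x in u for y in v]   # Kronecker product of the two eigenvectors

H = kron_sum(A, A)                   # d = 2 copies of A
print(matvec(H, uv), [(3 + 1) * x for x in uv])
```

The two printed vectors agree, showing that u (x) v is an eigenvector of the Kronecker sum with eigenvalue 3 + 1.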

Local power properties of some asymptotic tests in symmetric linear regression models

Relevance: 30.00%

Abstract:

In this paper we obtain asymptotic expansions, up to order n^{-1/2} and under a sequence of Pitman alternatives, for the nonnull distribution functions of the likelihood ratio, Wald, score and gradient test statistics in the class of symmetric linear regression models. This is a wide class of models which encompasses the t model and several other symmetric distributions with longer-than-normal tails. The asymptotic distributions of all four statistics are obtained for testing a subset of regression parameters. Furthermore, in order to compare the finite-sample performance of these tests in this class of models, Monte Carlo simulations are presented. An empirical application to a real data set is considered for illustrative purposes. (C) 2011 Elsevier B.V. All rights reserved.
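As a rough illustration of the four statistics compared in this abstract, the sketch below computes them in a much simpler setting than the paper's: an intercept-only Poisson model with mean exp(beta), testing H0: beta = beta0. The model, the function name and the counts are assumptions made for illustration. The definitions are the standard ones for a scalar parameter: LR = 2[l(beta_hat) - l(beta0)], Wald = (beta_hat - beta0)^2 K(beta_hat), score = U(beta0)^2 / K(beta0), and gradient = U(beta0)(beta_hat - beta0), where U is the score function and K the Fisher information.

```python
import math


def four_test_statistics(counts, beta0):
    """LR, Wald, score and gradient statistics for H0: beta = beta0
    in an intercept-only Poisson model with mean exp(beta)."""
    n = len(counts)
    total = sum(counts)
    beta_hat = math.log(total / n)       # unrestricted MLE (requires total > 0)

    def loglik(beta):                    # log-likelihood up to an additive constant
        return total * beta - n * math.exp(beta)

    def score_fn(beta):                  # U(beta): derivative of the log-likelihood
        return total - n * math.exp(beta)

    def info(beta):                      # Fisher information K(beta)
        return n * math.exp(beta)

    return {
        "lr": 2.0 * (loglik(beta_hat) - loglik(beta0)),
        "wald": (beta_hat - beta0) ** 2 * info(beta_hat),
        "score": score_fn(beta0) ** 2 / info(beta0),
        "gradient": score_fn(beta0) * (beta_hat - beta0),
    }


# Hypothetical counts; null hypothesis: mean equals 3.
stats = four_test_statistics([3, 5, 4, 6, 2], beta0=math.log(3.0))
print(stats)
```

All four statistics share the same chi-square limit under the null hypothesis, but they take different values in finite samples, which is why their local power is compared through higher-order expansions.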

Identifying regulational alterations in gene regulatory networks by state space representation of vector autoregressive models and variational annealing

Relevance: 30.00%

Abstract:

Background: When analyzing the effects of cell treatments such as drug dosing, identifying changes in gene network structure between normal and treated cells is a key task. One way to identify the changes is to compare the structures of networks estimated separately from data on normal and treated cells. However, this approach usually fails to estimate accurate gene networks because of the limited length of the time series data and measurement noise. Thus, approaches that identify changes in regulation by efficiently using time series data from both conditions are needed. Methods: We propose a new statistical approach that is based on the state space representation of the vector autoregressive model and estimates gene networks under two different conditions in order to identify changes in regulation between the conditions. In the mathematical model of our approach, hidden binary variables are newly introduced to indicate the presence of regulations in each condition. The hidden binary variables enable efficient data usage: data from both conditions are used for regulations common to both, while only the corresponding data are used for condition-specific regulations. Also, the similarity of the networks under the two conditions is automatically taken into account through the design of the potential function for the hidden binary variables. To estimate the hidden binary variables, we derive a new variational annealing method that searches for the configuration of the binary variables that maximizes the marginal likelihood. Results: For the performance evaluation, we use time series data from two topologically similar synthetic networks, and confirm that our proposed approach estimates commonly existing regulations as well as changes in regulation with higher coverage and precision than other existing approaches in almost all the experimental settings. For a real data application, our proposed approach is applied to time series data from normal human lung cells and human lung cells treated by stimulating EGF receptors and dosing an anticancer drug termed Gefitinib. In the treated lung cells, a cancer cell condition is simulated by the stimulation of EGF receptors, but the effect would be counteracted by the selective inhibition of EGF receptors by Gefitinib. However, gene expression profiles actually differ between the conditions, and the genes related to the identified changes are considered possible off-targets of Gefitinib. Conclusions: On the synthetically generated time series data, our proposed approach can identify changes in regulation more accurately than existing methods. By applying the proposed approach to the time series data on normal and treated human lung cells, candidate off-target genes of Gefitinib are found. According to the published clinical information, one of the genes can be related to a factor of interstitial pneumonia, which is known as a side effect of Gefitinib.

Bartlett corrections in Birnbaum-Saunders nonlinear regression models

Relevance: 30.00%

Abstract:

Lemonte and Cordeiro [Birnbaum-Saunders nonlinear regression models, Comput. Stat. Data Anal. 53 (2009), pp. 4441-4452] introduced a class of Birnbaum-Saunders (BS) nonlinear regression models potentially useful in lifetime data analysis. We give a general matrix Bartlett correction formula to improve the likelihood ratio (LR) tests in these models. The formula is simple enough to be used analytically to obtain several closed-form expressions in special cases. Our results generalize those in Lemonte et al. [Improved likelihood inference in Birnbaum-Saunders regressions, Comput. Stat. Data Anal. 54 (2010), pp. 1307-1316], which hold only for the BS linear regression models. We use Monte Carlo simulations to show that the corrected tests work better than the usual LR tests.

Stochastic simulation of time-series models combined with geostatistics to predict water-table scenarios in a Guarani Aquifer System outcrop area, Brazil

Relevance: 30.00%

Abstract:

Stochastic methods based on time-series modeling combined with geostatistics can be useful tools to describe the variability of water-table levels in time and space and to account for uncertainty. Water-level monitoring networks can give information about the dynamics of the aquifer domain in both dimensions. Time-series modeling is an elegant way to treat monitoring data without the complexity of physical mechanistic models. Time-series model predictions can be interpolated spatially, with the spatial differences in water-table dynamics determined by the spatial variation in the system properties and the temporal variation driven by the dynamics of the inputs into the system. An integration of stochastic methods is presented, based on time-series modeling and geostatistics, as a framework to predict water levels for decision making in groundwater management and land-use planning. The methodology is applied in a case study in a Guarani Aquifer System (GAS) outcrop area located in the southeastern part of Brazil. Communication of results in a clear and understandable form, via simulated scenarios, is discussed as an alternative when translating scientific knowledge into applications of stochastic hydrogeology in large aquifers with limited monitoring network coverage, like the GAS.

STATISTICAL TEST FOR GENOTYPE AND ENVIRONMENT CONTRIBUTION IN THE GENOTYPES x ENVIRONMENTS INTERACTION MATRIX

Relevance: 30.00%

Abstract:

The objective of the present work was to propose a method for testing the contribution of each level of the factors in a genotypes x environments (GxE) interaction using multi-environment trials analyses by means of an F test. The study evaluated a data set, with twenty genotypes and thirty-four environments, in a block design with four replications. The sum of squares within rows (genotypes) and columns (environments) of the GxE matrix was simulated, generating 10000 experiments to verify the empirical distribution. Results indicate a noncentral chi-square distribution for rows and columns of the GxE interaction matrix, which was also verified by the Kolmogorov-Smirnov test and Q-Q plot. Application of the F test identified the genotypes and environments that contributed the most to the GxE interaction. In this way, geneticists can select good genotypes in their studies.

Exact correlation functions in particle-reaction models with immobile particles

Relevance: 30.00%

Abstract:

Exact results on particle densities as well as correlators in two models of immobile particles, containing either a single species or else two distinct species, are derived. The models evolve following a descent dynamics through pair annihilation where each particle interacts once at most throughout its entire history. The resulting large number of stationary states leads to a non-vanishing configurational entropy. Our results are established for arbitrary initial conditions and are derived via a generating function method. The single-species model is the dual of the 1D zero-temperature kinetic Ising model with Kimball-Deker-Haake dynamics. In this way, both finite and semi-infinite chains, as well as the Bethe lattice, can be analysed. The relationship with the random sequential adsorption of dimers and weakly tapped granular materials is discussed.

Robust statistical modeling using the Birnbaum-Saunders-t distribution applied to insurance

Relevance: 30.00%

Abstract:

In this paper, we carry out robust modeling and influence diagnostics in Birnbaum-Saunders (BS) regression models. Specifically, we present some aspects related to BS and log-BS distributions and their generalizations from the Student-t distribution, and develop BS-t regression models, including maximum likelihood estimation based on the EM algorithm and diagnostic tools. In addition, we apply the obtained results to real data from insurance, which shows the uses of the proposed model. Copyright (c) 2011 John Wiley & Sons, Ltd.

On the impact of disproportional samples in credit scoring models: An application to a Brazilian bank data

Relevance: 30.00%

Abstract:

Statistical methods have been widely employed to assess the capabilities of credit scoring classification models in order to reduce the risk of wrong decisions when granting credit facilities to clients. The predictive quality of a classification model can be evaluated based on measures such as sensitivity, specificity, predictive values, accuracy, correlation coefficients and information-theoretic measures, such as relative entropy and mutual information. In this paper we analyze the performance of a naive logistic regression model (Hosmer & Lemeshow, 1989) and a logistic regression with state-dependent sample selection model (Cramer, 2004) applied to simulated data. Also, as a case study, the methodology is illustrated on a data set extracted from a Brazilian bank portfolio. Our simulation results so far reveal no statistically significant difference in predictive capacity between the naive logistic regression models and the logistic regression with state-dependent sample selection models. However, there is a strong difference between the distributions of the estimated default probabilities from these two statistical modeling techniques, with the naive logistic regression models always underestimating such probabilities, particularly in the presence of balanced samples. (C) 2012 Elsevier Ltd. All rights reserved.
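Several of the quality measures named in this abstract can be computed directly from a 2x2 confusion matrix. The sketch below covers predictive values, accuracy, the Matthews correlation coefficient (one common choice of "correlation coefficient" for binary classifiers; the abstract does not say which coefficient the authors use) and the mutual information between actual and predicted class. The counts and the function name are hypothetical.

```python
import math


def classification_measures(tp, fn, fp, tn):
    """Predictive values, accuracy, Matthews correlation and mutual information
    (in nats) from the four cells of a binary confusion matrix."""
    n = tp + fn + fp + tn
    ppv = tp / (tp + fp)                      # positive predictive value
    npv = tn / (tn + fn)                      # negative predictive value
    accuracy = (tp + tn) / n
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

    # Mutual information between actual class and predicted class.
    cells = {(1, 1): tp, (1, 0): fn, (0, 1): fp, (0, 0): tn}  # (actual, predicted)
    actual = {1: tp + fn, 0: fp + tn}
    predicted = {1: tp + fp, 0: fn + tn}
    mi = sum((c / n) * math.log((c / n) / ((actual[a] / n) * (predicted[p] / n)))
             for (a, p), c in cells.items() if c > 0)

    return {"ppv": ppv, "npv": npv, "accuracy": accuracy, "mcc": mcc, "mi": mi}


# Hypothetical portfolio: 50 defaulters and 150 non-defaulters.
measures = classification_measures(tp=40, fn=10, fp=20, tn=130)
print(measures)
```

Predictive values depend on the proportion of defaulters in the sample, so they are among the measures most affected by the disproportional sampling the paper investigates.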

Influence diagnostics for elliptical semiparametric mixed models

Relevance: 30.00%

Abstract:

In this paper we extend semiparametric mixed linear models with normal errors to elliptical errors in order to permit distributions with heavier and lighter tails than the normal. Penalized likelihood equations are applied to derive the maximum penalized likelihood estimates (MPLEs), which appear to be robust against outlying observations in the sense of the Mahalanobis distance. A reweighted iterative process based on the back-fitting method is proposed for the parameter estimation, and the local influence curvatures are derived under some usual perturbation schemes to study the sensitivity of the MPLEs. Two motivating examples preliminarily analyzed under normal errors are reanalyzed considering some appropriate elliptical errors. The local influence approach is used to compare the sensitivity of the model estimates.

Local power and size properties of the LR, Wald, score and gradient tests in dispersion models

Relevance: 30.00%

Abstract:

We derive asymptotic expansions for the nonnull distribution functions of the likelihood ratio, Wald, score and gradient test statistics in the class of dispersion models, under a sequence of Pitman alternatives. The asymptotic distributions of these statistics are obtained for testing a subset of regression parameters and for testing the precision parameter. Based on these nonnull asymptotic expansions, the powers of all four tests, which are equivalent to first order, are compared. Furthermore, in order to compare the finite-sample performance of these tests in this class of models, Monte Carlo simulations are presented. An empirical application to a real data set is considered for illustrative purposes. (C) 2012 Elsevier B.V. All rights reserved.

QSAR models for inhibitors of physiological impact of Escherichia coli that leads to diarrhea

Relevance: 30.00%

Abstract:

Quantitative structure-activity relationships (QSARs) developed to evaluate the percentage of inhibition of STa-stimulated (Escherichia coli) cGMP accumulation in T84 cells are calculated by the Monte Carlo method. This endpoint represents a measure of the biological activity of a substance against diarrhea. The statistical quality of the developed models is quite good. The approach is tested using three random splits of the data into training and test sets. The statistical characteristics for the three splits are the following: (1) n = 20, r² = 0.7208, q² = 0.6583, s = 16.9, F = 46 (training set); n = 11, r² = 0.8986, s = 14.6 (test set); (2) n = 19, r² = 0.6689, q² = 0.5683, s = 17.6, F = 34 (training set); n = 12, r² = 0.8998, s = 12.1 (test set); and (3) n = 20, r² = 0.7141, q² = 0.6525, s = 14.7, F = 45 (training set); n = 11, r² = 0.8858, s = 19.5 (test set). Based on the models proposed here, hypothetical compounds that may be useful agents against diarrhea are suggested.
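The reported F values can be cross-checked against n and the squared correlation coefficient. The sketch below assumes, although the abstract does not say so, that each model is a least-squares regression on a single descriptor, for which F = r2 (n - 2) / (1 - r2). The function name and the parameter k are illustrative.

```python
def f_from_r2(n, r2, k=1):
    """F statistic of a least-squares regression with k predictors fitted to n points."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# (n, r2, reported F) for the three training sets quoted in the abstract.
training_sets = [(20, 0.7208, 46), (19, 0.6689, 34), (20, 0.7141, 45)]

for n, r2, reported in training_sets:
    print(n, r2, round(f_from_r2(n, r2), 1), reported)
```

Under this one-descriptor assumption the recomputed values round to the reported F statistics for all three splits.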

Development of musculoskeletal models for the design and the pre-clinical validation of hip resurfacing prosthesis

Relevance: 30.00%

Abstract:

Background. The surgical treatment of dysfunctional hips is a severe condition for the patient and a costly therapy for the public health system. Hip resurfacing techniques seem to hold the promise of various advantages over traditional THR, particularly for young and active patients. Although the lesson provided in the past by many branches of engineering is that success in designing competitive products can be achieved only by predicting the possible scenarios of failure, to date implant quality is poorly addressed pre-clinically. Thus revision is the only reliable, albeit delayed, end point for assessment. The aim of the present work was to model the musculoskeletal system so as to develop a protocol for predicting failure of hip resurfacing prostheses. Methods. Preliminary studies validated the technique for the generation of subject-specific finite element (FE) models of long bones from Computed Tomography data. The proposed protocol consisted of the numerical analysis of the prosthesis biomechanics by deterministic and statistical studies so as to assess the risk of biomechanical failure under the different operative conditions the implant might face in a population of interest during various activities of daily living. Physiological conditions were defined including the variability of the anatomy, bone densitometry, surgery uncertainties and published boundary conditions at the hip. The protocol was tested by analysing a successful design on the market and a new prototype of a resurfacing prosthesis. Results. The intrinsic accuracy of the models on bone stress predictions (RMSE < 10%) was aligned with the current state of the art in this field. The accuracy of prediction of the bone-prosthesis contact mechanics was also excellent (< 0.001 mm). The sensitivity of the model predictions to uncertainties in the modelling parameters was found to be below 8.4%. The analysis of the successful design resulted in very good agreement with published retrospective studies. The geometry optimisation of the new prototype led to a final design with a low risk of failure. The statistical analysis confirmed the minimal risk of the optimised design over the entire population of interest. The performance of the optimised design showed a significant improvement with respect to the first prototype (+35%). Limitations. In the authors' opinion the major limitation of this study lies in the boundary conditions. The muscular forces and the hip joint reaction were derived from the few data available in the literature, which can be considered significant but hardly representative of the entire variability of boundary conditions the implant might face over the patient population. This moved the focus of the research to modelling the musculoskeletal system; the ongoing activity is to develop subject-specific musculoskeletal models of the lower limb from medical images. Conclusions. The developed protocol was able to accurately predict known clinical outcomes when applied to a well-established device and to support the design optimisation phase, providing important information on critical characteristics of the patients, when applied to a new prosthesis. The presented approach is general enough to allow the extension of the protocol to a large set of orthopaedic scenarios with minor changes. Hence, a failure mode analysis criterion can be considered a suitable tool in developing new orthopaedic devices.
