Abstract:
Artificial neural networks (ANNs) have been widely applied to complex biological problems. An important feature of neural models is that their use does not depend on the theoretical distribution of the data. ANNs are frequently deemed to perform significantly better than linear or non-linear regression-based statistical methods when suitable sample sizes are provided, especially for multidimensional and non-linear processes. This work used three well-known neural network methods to evaluate whether they give more accurate predictions than a conventional regression method for the pupal weight of Chrysomya megacephala, a species of blowfly (Diptera: Calliphoridae), using larval density (i.e. the initial number of larvae), amount of available food and pupal size as input data. The neural networks were more accurate than the statistical model (multiple regression). No considerable differences were detected among the three types of network used (Multi-layer Perceptron, Radial Basis Function and Generalised Regression Neural Network). The superiority of these neural models over a classical statistical method is important because more accurate models may clarify several intricate aspects of the nutritional ecology of blowflies.
Abstract:
In accelerating dark energy models, the estimates of the Hubble constant, $H_0$, from the Sunyaev-Zel'dovich effect (SZE) and X-ray surface brightness of galaxy clusters may depend on the matter content ($\Omega_M$), the curvature ($\Omega_K$) and the equation of state parameter $\omega$. In this article, by using a sample of 25 angular diameter distances of galaxy clusters described by the elliptical $\beta$ model obtained through the SZE/X-ray technique, we constrain $H_0$ in the framework of a general $\Lambda$CDM model (arbitrary curvature) and a flat XCDM model with a constant equation of state parameter $\omega = p_x/\rho_x$. In order to avoid the use of priors in the cosmological parameters, we apply a joint analysis involving the baryon acoustic oscillations (BAO) and the CMB shift parameter signature. By taking into account the statistical and systematic errors of the SZE/X-ray technique we obtain for the nonflat $\Lambda$CDM model $H_0 = 74^{+5.0}_{-4.0}$ km s$^{-1}$ Mpc$^{-1}$ ($1\sigma$), whereas for a flat universe with constant equation of state parameter we find $H_0 = 72^{+5.5}_{-4.0}$ km s$^{-1}$ Mpc$^{-1}$ ($1\sigma$). By assuming that galaxy clusters are described by a spherical $\beta$ model these results change to $H_0 = 6^{+8.0}_{-7.0}$ and $H_0 = 59^{+9.0}_{-6.0}$ km s$^{-1}$ Mpc$^{-1}$ ($1\sigma$), respectively. The results from the elliptical description are in good agreement with independent studies from the Hubble Space Telescope key project and recent estimates based on the Wilkinson Microwave Anisotropy Probe, thereby suggesting that the combination of these three independent phenomena provides an interesting method to constrain the Hubble constant. As a bonus, the adoption of the elliptical description is revealed to be a quite realistic assumption. Finally, by comparing these results with a recent determination for a flat $\Lambda$CDM model using only the SZE/X-ray technique and BAO, we see that the geometry has a very weak influence on $H_0$ estimates for this combination of data.
Abstract:
An extension of some standard likelihood based procedures to heteroscedastic nonlinear regression models under scale mixtures of skew-normal (SMSN) distributions is developed. This novel class of models provides a useful generalization of the heteroscedastic symmetrical nonlinear regression models (Cysneiros et al., 2010), since the random term distributions cover both symmetric as well as asymmetric and heavy-tailed distributions such as skew-t, skew-slash, skew-contaminated normal, among others. A simple EM-type algorithm for iteratively computing maximum likelihood estimates of the parameters is presented and the observed information matrix is derived analytically. In order to examine the performance of the proposed methods, some simulation studies are presented to show the robust aspect of this flexible class against outlying and influential observations and that the maximum likelihood estimates based on the EM-type algorithm do provide good asymptotic properties. Furthermore, local influence measures and the one-step approximations of the estimates in the case-deletion model are obtained. Finally, an illustration of the methodology is given considering a data set previously analyzed under the homoscedastic skew-t nonlinear regression model. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
Background: Lynch syndrome (LS) is the most common form of inherited predisposition to colorectal cancer (CRC), accounting for 2-5% of all CRC. LS is an autosomal dominant disease characterized by mutations in the mismatch repair genes mutL homolog 1 (MLH1), mutS homolog 2 (MSH2), postmeiotic segregation increased 1 (PMS1), postmeiotic segregation increased 2 (PMS2) and mutS homolog 6 (MSH6). Mutation risk prediction models can be incorporated into clinical practice, facilitating the decision-making process and identifying individuals for molecular investigation. This is extremely important in countries with limited economic resources. This study aims to evaluate the sensitivity and specificity of five predictive models for germline mutations in repair genes in a sample of individuals with suspected Lynch syndrome. Methods: Blood samples from 88 patients were analyzed by sequencing the MLH1, MSH2 and MSH6 genes. The probability of detecting a mutation was calculated using the PREMM, Barnetson, MMRpro, Wijnen and Myriad models. To evaluate the sensitivity and specificity of the models, receiver operating characteristic curves were constructed. Results: Among the 88 patients included in this analysis, 31 mutations were identified: 16 in the MSH2 gene and 15 in the MLH1 gene; no pathogenic mutations were identified in the MSH6 gene. The AUCs for the PREMM (0.846), Barnetson (0.850), MMRpro (0.821) and Wijnen (0.807) models did not differ significantly. The Myriad model presented a lower AUC (0.704) than the four other models evaluated. Considering thresholds of >= 5%, the models' sensitivity ranged from 1 (Myriad) to 0.87 (Wijnen) and specificity ranged from 0 (Myriad) to 0.38 (Barnetson). Conclusions: The Barnetson, PREMM, MMRpro and Wijnen models present similar AUCs. The AUC of the Myriad model is statistically inferior to those of the four other models.
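The ROC quantities reported in this abstract can be illustrated with a minimal sketch. The scores and carrier labels below are made up for illustration and are not the study's data; the function names are hypothetical. The sketch computes sensitivity and specificity at a probability threshold (here >= 5%) and the AUC as the probability that a randomly chosen carrier receives a higher predicted probability than a randomly chosen non-carrier.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Call a case positive when score >= threshold; return (sensitivity, specificity)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, labels):
    """AUC via pairwise comparison of carriers and non-carriers (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up predicted mutation probabilities and true carrier status (1 = carrier).
scores = [0.02, 0.04, 0.06, 0.10, 0.30, 0.03, 0.20, 0.50, 0.80, 0.07]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

sens, spec = sensitivity_specificity(scores, labels, threshold=0.05)
print("sensitivity:", sens, "specificity:", spec)
print("AUC:", auc(scores, labels))
```

A low threshold such as 5% tends to give high sensitivity at the cost of specificity, which matches the pattern of values reported in the abstract.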
Abstract:
We show that the Kronecker sum of d >= 2 copies of a random one-dimensional sparse model displays a spectral transition of the type predicted by Anderson, from absolutely continuous around the center of the band to pure point around the boundaries. Possible applications to physics and open problems are discussed briefly.
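As a rough illustration of the construction named in this abstract, the sketch below builds the Kronecker sum of d copies of a matrix A (for d = 2 this is A ⊗ I + I ⊗ A) and checks the standard fact that tensor products of eigenvectors of A are eigenvectors of the sum, with the eigenvalues added. The 2 x 2 matrix is a toy stand-in chosen for its integer eigenvectors; it is not the random sparse model of the paper, and a finite example cannot exhibit the spectral transition itself.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def kronecker_sum(a, d):
    """Sum over k of I x ... x a x ... x I, with a in the k-th of d slots."""
    n = len(a)
    total = None
    for k in range(d):
        term = [[1]]
        for j in range(d):
            term = kron(term, a if j == k else identity(n))
        total = term if total is None else mat_add(total, term)
    return total


def mat_vec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def kron_vec(v, w):
    """Tensor product of two vectors, ordered consistently with kron()."""
    return [x * y for x in v for y in w]


# Toy symmetric matrix: eigenvector (1, 1) has eigenvalue 3, (1, -1) has eigenvalue 1.
a = [[2, 1], [1, 2]]
v, lam = [1, 1], 3
w, mu = [1, -1], 1

h2 = kronecker_sum(a, 2)
u = kron_vec(v, w)
# u is an eigenvector of the Kronecker sum with eigenvalue lam + mu.
print(mat_vec(h2, u) == [(lam + mu) * x for x in u])
```

The same additivity holds for any d, so the spectrum of the d-dimensional operator is the set of d-fold sums of one-dimensional eigenvalues.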
Abstract:
In this paper we obtain asymptotic expansions, up to order $n^{-1/2}$ and under a sequence of Pitman alternatives, for the nonnull distribution functions of the likelihood ratio, Wald, score and gradient test statistics in the class of symmetric linear regression models. This is a wide class of models which encompasses the t model and several other symmetric distributions with longer-than-normal tails. The asymptotic distributions of all four statistics are obtained for testing a subset of regression parameters. Furthermore, in order to compare the finite-sample performance of these tests in this class of models, Monte Carlo simulations are presented. An empirical application to a real data set is considered for illustrative purposes. (C) 2011 Elsevier B.V. All rights reserved.
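For orientation, the four statistics compared in this abstract can be written in their textbook form. The sketch below assumes the simpler case of a simple null hypothesis $H_0\colon \theta = \theta_0$, whereas the paper tests a subset of regression parameters; the notation ($\ell$ for the log-likelihood, $U$ for the score vector, $K$ for the Fisher information, $\hat\theta$ for the unrestricted maximum likelihood estimate) is an assumption of this sketch and not necessarily the paper's.

```latex
\begin{align*}
S_{LR} &= 2\{\ell(\hat\theta) - \ell(\theta_0)\}
  && \text{(likelihood ratio)} \\
S_{W}  &= (\hat\theta - \theta_0)^{\top} K(\hat\theta)\,(\hat\theta - \theta_0)
  && \text{(Wald)} \\
S_{R}  &= U(\theta_0)^{\top} K(\theta_0)^{-1} U(\theta_0)
  && \text{(score)} \\
S_{T}  &= U(\theta_0)^{\top} (\hat\theta - \theta_0)
  && \text{(gradient)}
\end{align*}
% Pitman (local) alternatives: H_{1n}: \theta = \theta_0 + n^{-1/2}\epsilon
```

Under such local alternatives the four statistics share the same limiting noncentral chi-square distribution, which is why expansions to order $n^{-1/2}$ are needed to tell their local powers apart.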
Abstract:
Background: When analysing the effects of cell treatments such as drug dosing, a key task is identifying changes in gene network structure between normal and treated cells. One way to identify the changes is to compare the structures of networks estimated separately from data on normal and treated cells. However, this approach usually fails to estimate accurate gene networks because of the limited length of the time series data and measurement noise. Approaches are therefore needed that identify changes in regulation by using time series data from both conditions efficiently. Methods: We propose a new statistical approach that is based on the state space representation of the vector autoregressive model and estimates gene networks under two different conditions in order to identify changes in regulation between the conditions. In the mathematical model of our approach, hidden binary variables are newly introduced to indicate the presence of regulations under each condition. The hidden binary variables enable efficient use of the data: data from both conditions are used for regulations common to both, while only the corresponding data are used for condition-specific regulations. In addition, the similarity of the networks under the two conditions is automatically taken into account through the design of the potential function for the hidden binary variables. To estimate the hidden binary variables, we derive a new variational annealing method that searches for the configuration of the binary variables that maximizes the marginal likelihood. Results: For the performance evaluation, we use time series data from two topologically similar synthetic networks and confirm that our proposed approach estimates commonly existing regulations, as well as changes in regulation, with higher coverage and precision than other existing approaches in almost all the experimental settings. For a real data application, our proposed approach is applied to time series data from normal human lung cells and human lung cells treated by stimulating EGF receptors and dosing an anticancer drug termed Gefitinib. In the treated lung cells, a cancer cell condition is simulated by the stimulation of EGF receptors, but the effect would be counteracted by the selective inhibition of EGF receptors by Gefitinib. However, gene expression profiles actually differ between the conditions, and the genes related to the identified changes are considered possible off-targets of Gefitinib. Conclusions: On the synthetically generated time series data, our proposed approach identifies changes in regulation more accurately than existing methods. By applying the proposed approach to the time series data on normal and treated human lung cells, candidate off-target genes of Gefitinib are found. According to the published clinical information, one of the genes can be related to a factor of interstitial pneumonia, which is known as a side effect of Gefitinib.
Abstract:
Lemonte and Cordeiro [Birnbaum-Saunders nonlinear regression models, Comput. Stat. Data Anal. 53 (2009), pp. 4441-4452] introduced a class of Birnbaum-Saunders (BS) nonlinear regression models potentially useful in lifetime data analysis. We give a general matrix Bartlett correction formula to improve the likelihood ratio (LR) tests in these models. The formula is simple enough to be used analytically to obtain several closed-form expressions in special cases. Our results generalize those in Lemonte et al. [Improved likelihood inference in Birnbaum-Saunders regressions, Comput. Stat. Data Anal. 54 (2010), pp. 1307-1316], which hold only for the BS linear regression models. We use Monte Carlo simulations to show that the corrected tests perform better than the usual LR tests.
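The general idea of a Bartlett correction can be stated in a few lines. The sketch below is the generic textbook form, not the matrix formula derived in the paper; the symbols $q$ (number of restrictions tested) and $b$ (the order $n^{-1}$ term in the expected value of the statistic) are notation assumed here for illustration.

```latex
\begin{align*}
\mathrm{E}(LR) &= q + b + O(n^{-2}), \qquad b = O(n^{-1}), \\
LR^{*} &= \frac{LR}{1 + b/q}, \\
\Pr(LR \le x) &= \Pr(\chi^2_q \le x) + O(n^{-1}), \qquad
\Pr(LR^{*} \le x) = \Pr(\chi^2_q \le x) + O(n^{-2}).
\end{align*}
```

Rescaling the statistic so that its mean matches that of the reference $\chi^2_q$ distribution to a higher order is what brings the size of the corrected test closer to the nominal level in small samples.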
Abstract:
Stochastic methods based on time-series modeling combined with geostatistics can be useful tools to describe the variability of water-table levels in time and space and to account for uncertainty. Water-level monitoring networks can give information about the dynamics of the aquifer domain in both dimensions. Time-series modeling is an elegant way to treat monitoring data without the complexity of physical mechanistic models. Time-series model predictions can be interpolated spatially, with the spatial differences in water-table dynamics determined by the spatial variation in the system properties and the temporal variation driven by the dynamics of the inputs into the system. An integration of stochastic methods is presented, based on time-series modeling and geostatistics, as a framework to predict water levels for decision making in groundwater management and land-use planning. The methodology is applied in a case study in a Guarani Aquifer System (GAS) outcrop area located in the southeastern part of Brazil. Communication of results in a clear and understandable form, via simulated scenarios, is discussed as an alternative when translating scientific knowledge into applications of stochastic hydrogeology in large aquifers with limited monitoring network coverage, like the GAS.
Abstract:
The objective of the present work was to propose a method for testing the contribution of each level of the factors in a genotypes x environments (GxE) interaction using multi-environment trials analyses by means of an F test. The study evaluated a data set, with twenty genotypes and thirty-four environments, in a block design with four replications. The sum of squares within rows (genotypes) and columns (environments) of the GxE matrix was simulated, generating 10000 experiments to verify the empirical distribution. Results indicate a noncentral chi-square distribution for rows and columns of the GxE interaction matrix, which was also verified by the Kolmogorov-Smirnov test and Q-Q plot. Application of the F test identified the genotypes and environments that contributed the most to the GxE interaction. In this way, geneticists can select good genotypes in their studies.
Abstract:
Exact results on particle densities as well as correlators in two models of immobile particles, containing either a single species or else two distinct species, are derived. The models evolve following a descent dynamics through pair annihilation where each particle interacts once at most throughout its entire history. The resulting large number of stationary states leads to a non-vanishing configurational entropy. Our results are established for arbitrary initial conditions and are derived via a generating function method. The single-species model is the dual of the 1D zero-temperature kinetic Ising model with Kimball-Deker-Haake dynamics. In this way, both finite and semi-infinite chains, as well as the Bethe lattice, can be analysed. The relationship with the random sequential adsorption of dimers and weakly tapped granular materials is discussed.
Abstract:
In this paper, we carry out robust modeling and influence diagnostics in Birnbaum-Saunders (BS) regression models. Specifically, we present some aspects related to BS and log-BS distributions and their generalizations from the Student-t distribution, and develop BS-t regression models, including maximum likelihood estimation based on the EM algorithm and diagnostic tools. In addition, we apply the obtained results to real data from insurance, which shows the uses of the proposed model. Copyright (c) 2011 John Wiley & Sons, Ltd.
Abstract:
Statistical methods have been widely employed to assess the capabilities of credit scoring classification models in order to reduce the risk of wrong decisions when granting credit facilities to clients. The predictive quality of a classification model can be evaluated based on measures such as sensitivity, specificity, predictive values, accuracy, correlation coefficients and information theoretical measures, such as relative entropy and mutual information. In this paper we analyze the performance of a naive logistic regression model (Hosmer & Lemeshow, 1989) and a logistic regression with state-dependent sample selection model (Cramer, 2004) applied to simulated data. Also, as a case study, the methodology is illustrated on a data set extracted from a Brazilian bank portfolio. Our simulation results so far reveal no statistically significant difference in predictive capacity between the naive logistic regression models and the logistic regression with state-dependent sample selection models. However, there is a strong difference between the distributions of the estimated default probabilities from these two statistical modeling techniques, with the naive logistic regression models always underestimating such probabilities, particularly in the presence of balanced samples. (C) 2012 Elsevier Ltd. All rights reserved.
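Several of the predictive-quality measures listed in this abstract can be computed from a single 2 x 2 confusion matrix. The sketch below is a minimal illustration with made-up counts, not the paper's data. It takes the "correlation coefficient" to be the Matthews correlation coefficient and measures mutual information between actual and predicted class in nats; both choices are assumptions of this sketch, and the function names are hypothetical.

```python
import math


def basic_measures(tp, fp, fn, tn):
    """Sensitivity, specificity, predictive values and accuracy from counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient (phi coefficient of the 2x2 table)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom > 0 else 0.0


def mutual_information(tp, fp, fn, tn):
    """Mutual information (in nats) between actual and predicted class."""
    n = tp + fp + fn + tn
    # Keys are (actual, predicted) with 1 = default, 0 = non-default.
    joint = {(1, 1): tp / n, (1, 0): fn / n, (0, 1): fp / n, (0, 0): tn / n}
    p_actual = {1: (tp + fn) / n, 0: (fp + tn) / n}
    p_pred = {1: (tp + fp) / n, 0: (fn + tn) / n}
    mi = 0.0
    for (a, p), pj in joint.items():
        if pj > 0:  # empty cells contribute nothing
            mi += pj * math.log(pj / (p_actual[a] * p_pred[p]))
    return mi


# Made-up counts for illustration: 50 actual defaults, 50 non-defaults.
counts = dict(tp=40, fp=20, fn=10, tn=30)
print(basic_measures(**counts))
print("MCC:", matthews_corrcoef(**counts))
print("mutual information:", mutual_information(**counts))
```

Mutual information and the correlation coefficient are both zero when predictions are independent of the true class, which makes them useful complements to accuracy on unbalanced samples.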