994 results for Variance estimation


Relevance: 30.00%

Abstract:

Our objective was to estimate Bos primigenius taurus introgression in American Zebu cattle. One hundred and four American Zebu (Nellore) cattle were submitted to mtDNA, microsatellite and satellite analysis. Twenty-three alleles were detected in microsatellite analysis, averaging 4.6 ± 1.82 per locus. Variance component comparisons of microsatellite allele sizes allowed the construction of two clusters separating taurus and indicus. No significant variation was observed when indicus and taurus mtDNA were compared. Three possible genotypes of 1711b satellite DNA were identified. All European animals showed the same restriction pattern, suggesting a Zebu-specific restriction pattern. The frequencies of B. primigenius indicus-specific microsatellite alleles and 1711b satellite DNA restriction patterns lead to an estimate of 14% taurine contribution in purebred Nellore.

Relevance: 30.00%

Abstract:

The motivating problem concerns the estimation of the growth curve of solitary corals that follow the nonlinear Von Bertalanffy Growth Function (VBGF). The most common parameterization of the VBGF for corals is based on two parameters: the ultimate length L∞ and the growth rate k. One aim was to find a more reliable method for estimating these parameters, one that can capture the influence of environmental covariates. The main issue with current methods is that they force the linearization of the VBGF and neglect intra-individual variability. The idea was to use a hierarchical nonlinear model, which has the appealing features of taking into account the influence of collection sites, possible intra-site measurement correlation and variance heterogeneity, and which can handle the influence of environmental factors and all the reliable information that might influence coral growth. This method was applied to two databases of different solitary corals, i.e. Balanophyllia europaea and Leptopsammia pruvoti, collected at six sites under different environmental conditions, and it yielded a decisive improvement in the results. Nevertheless, the theory of the energy balance in growth establishes the linear correlation of the two parameters and the independence of the ultimate length L∞ from environmental covariates, so a further aim of the thesis was to propose a new parameterization based on the ultimate length and a parameter c that explicitly describes the part of growth ascribable to site-specific conditions such as environmental factors. We explored the possibility of estimating the parameters of this new VBGF parameterization via the nonlinear hierarchical model. Again there was a general improvement with respect to traditional methods. The results of the two parameterizations were similar, although a very slight improvement was observed in the new one. 
This is, nevertheless, more suitable from a theoretical point of view when considering environmental covariates.
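
As a minimal illustration of the two-parameter growth curve described in this abstract, the sketch below evaluates the VBGF in its commonly used form L(t) = L∞ · (1 − exp(−k·t)). This functional form (length zero at age zero) and the numeric values are assumptions for illustration; the abstract does not state them, and the hierarchical model and the c-parameterization are not reproduced here.

```python
import math


def vbgf_length(age, l_inf, k):
    """Length at a given age under the assumed two-parameter VBGF:
    L(t) = l_inf * (1 - exp(-k * t)).
    """
    return l_inf * (1.0 - math.exp(-k * age))


# Hypothetical parameter values, chosen only to show the shape of the curve.
l_inf = 20.0  # ultimate length (arbitrary units)
k = 0.1       # growth rate per year

# Lengths at ages 0, 5, 10, 15 and 20: growth slows as length approaches l_inf.
lengths = [vbgf_length(t, l_inf, k) for t in range(0, 21, 5)]
```

Under this form the curve rises monotonically and never exceeds L∞, which is why the two parameters are hard to estimate separately from short growth series.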

Relevance: 30.00%

Abstract:

This paper discusses estimation of the tumor incidence rate, the death rate given that a tumor is present and the death rate given that a tumor is absent, using a discrete multistage model. The model was originally proposed by Dewanji and Kalbfleisch (1986), and the maximum likelihood estimate of the tumor incidence rate was obtained using the EM algorithm. In this paper, we use a reparametrization to simplify the estimation procedure. The resulting estimates are not always the same as the maximum likelihood estimates but are asymptotically equivalent. In addition, explicit expressions for the asymptotic variance and bias of the proposed estimators are derived. These results can be used to compare the efficiency of different sacrifice schemes in carcinogenicity experiments.

Relevance: 30.00%

Abstract:

Functional Magnetic Resonance Imaging (fMRI) is a non-invasive technique which is commonly used to quantify changes in blood oxygenation and flow coupled to neuronal activation. One of the primary goals of fMRI studies is to identify localized brain regions where neuronal activation levels vary between groups. Single voxel t-tests have been commonly used to determine whether activation related to the protocol differs across groups. Due to the generally limited number of subjects within each study, accurate estimation of variance at each voxel is difficult. Thus, combining information across voxels in the statistical analysis of fMRI data is desirable in order to improve efficiency. Here we construct a hierarchical model and apply an Empirical Bayes framework to the analysis of group fMRI data, employing techniques used in high-throughput genomic studies. The key idea is to shrink residual variances by combining information across voxels, and subsequently to construct an improved test statistic in lieu of the classical t-statistic. This hierarchical model results in a shrinkage of voxel-wise residual sample variances towards a common value. The shrunken estimator for voxel-specific variance components in the group analyses outperforms the classical residual error estimator in terms of mean squared error. Moreover, the shrunken test statistic decreases the false positive rate when testing differences in brain contrast maps across a wide range of simulation studies. This methodology was also applied to experimental data regarding a cognitive activation task.
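
The shrinkage idea in this abstract can be sketched as follows. The weighted-average formula is the one commonly used in genomic moderated t-statistics and is assumed here; the abstract does not give its exact estimator. The prior variance and prior degrees of freedom are taken as given (hypothetical inputs), whereas a real analysis would estimate them from all voxels.

```python
import math


def shrink_variance(sample_var, df, prior_var, prior_df):
    """Shrink one voxel's residual variance towards a common prior value.

    The result is a weighted average of the prior variance and the sample
    variance, weighted by their degrees of freedom (assumed form).
    """
    return (prior_df * prior_var + df * sample_var) / (prior_df + df)


def moderated_t(effect, unscaled_se, sample_var, df, prior_var, prior_df):
    """t-like statistic that uses the shrunken variance in the denominator."""
    shrunk = shrink_variance(sample_var, df, prior_var, prior_df)
    return effect / (unscaled_se * math.sqrt(shrunk))


# Hypothetical voxel: noisy variance estimate from few subjects.
voxel_var = shrink_variance(sample_var=4.0, df=10, prior_var=1.0, prior_df=10)
```

With equal degrees of freedom on both sides, the shrunken variance sits halfway between the voxel's own estimate and the common value, which stabilizes the test statistic when few subjects are available.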

Relevance: 30.00%

Abstract:

Recurrent event data are largely characterized by the rate function, but smoothing techniques for estimating the rate function have never been rigorously developed or studied in the statistical literature. This paper considers the moment and least squares methods for estimating the rate function from recurrent event data. With an independent censoring assumption on the recurrent event process, we study statistical properties of the proposed estimators and propose bootstrap procedures for bandwidth selection and for the approximation of confidence intervals in the estimation of the occurrence rate function. We find that the moment method without resmoothing via a smaller bandwidth will produce a curve with nicks occurring at the censoring times, whereas there is no such problem with the least squares method. Furthermore, the asymptotic variance of the least squares estimator is shown to be smaller under regularity conditions. However, in the implementation of the bootstrap procedures, the moment method is computationally more efficient than the least squares method because the former approach uses condensed bootstrap data. The performance of the proposed procedures is studied through Monte Carlo simulations and an epidemiological example on intravenous drug users.

Relevance: 30.00%

Abstract:

In regression analysis, covariate measurement error occurs in many applications. The error-prone covariates are often referred to as latent variables. In this study, we extended the work of Chan et al. (2008) on recovering the latent slope in a simple regression model to the multiple regression model. We presented an approach that applied the Monte Carlo method in the Bayesian framework to the parametric regression model with measurement error in an explanatory variable. The proposed estimator applied the conditional expectation of the latent slope given the observed outcome and surrogate variables in the multiple regression model. A simulation study was presented showing that the method produces an estimator that is efficient in the multiple regression model, especially when the measurement error variance of the surrogate variable is large.

Relevance: 30.00%

Abstract:

In this paper we investigate whether conventional text categorization methods may suffice to infer different verbal intelligence levels. This research goal relies on the hypothesis that the vocabulary that speakers make use of reflects their verbal intelligence levels. Automatic verbal intelligence estimation of users in a spoken language dialog system may be useful when defining an optimal dialog strategy by improving its adaptation capabilities. The work is based on a corpus containing descriptions (i.e. monologs) of a short film by test persons with different educational backgrounds, together with the verbal intelligence scores of the speakers. First, a one-way analysis of variance was performed to compare the monologs with the film transcription and to demonstrate that there are differences in the vocabulary used by test persons with different verbal intelligence levels. Then, for the classification task, the monologs were represented as feature vectors using the classical TF–IDF weighting scheme. The Naive Bayes, k-nearest neighbors and Rocchio classifiers were tested. In this paper we describe and compare these classification approaches, define the optimal classification parameters and discuss the classification results obtained.
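
To make the feature representation concrete, here is a small sketch of TF–IDF weighting using raw term counts and idf = log(N / df). This is one of several common variants and is an assumption; the abstract does not say which variant, tokenization or normalization was used. The example documents are invented.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document.

    weight = (term count in document) * log(N / number of documents
    containing the term). Tokenization is a plain whitespace split.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append(
            {term: count * math.log(n_docs / doc_freq[term])
             for term, count in counts.items()}
        )
    return vectors


# Invented toy corpus standing in for the monologs.
docs = ["the man runs", "the dog runs fast", "the cat"]
vecs = tfidf_vectors(docs)
```

A term that occurs in every document, such as "the" above, receives weight zero, so only vocabulary that distinguishes speakers contributes to the classifiers' input.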

Relevance: 30.00%

Abstract:

This paper presents a time-domain stochastic system identification method based on maximum likelihood estimation (MLE) with the expectation maximization (EM) algorithm. The effectiveness of this structural identification method is evaluated through numerical simulation in the context of the ASCE benchmark problem on structural health monitoring. The benchmark structure is a four-story, two-bay by two-bay steel-frame scale model structure built in the Earthquake Engineering Research Laboratory at the University of British Columbia, Canada. This paper focuses on Phase I of the analytical benchmark studies. A MATLAB-based finite element analysis code obtained from the IASC-ASCE SHM Task Group web site is used to calculate the dynamic response of the prototype structure. A total of 100 simulations were run with this code in order to evaluate the proposed identification method. There are several techniques for system identification. In this work, the stochastic subspace identification (SSI) method has been used for comparison. SSI is a well-known method that computes accurate estimates of the modal parameters. The principles of the SSI method are introduced in the paper, and the proposed MLE with EM algorithm is then explained in detail. The advantages of the proposed structural identification method can be summarized as follows: (i) the method is based on maximum likelihood, which implies minimum variance estimates; (ii) EM is a computationally simpler estimation procedure than other optimization algorithms; (iii) it estimates more parameters than SSI, and these estimates are accurate. On the other hand, the main disadvantages of the method are: (i) the EM algorithm is an iterative procedure and takes time to converge; and (ii) the method needs starting values for the parameters. 
Modal parameters (eigenfrequencies, damping ratios and mode shapes) of the benchmark structure have been estimated using both the SSI method and the proposed MLE + EM method. The numerical results show that the proposed method identifies eigenfrequencies, damping ratios and mode shapes reasonably well even in the presence of 10% measurement noise. These modal parameters are more accurate than the SSI-estimated modal parameters.

Relevance: 30.00%

Abstract:

Prediction at ungauged sites is essential for water resources planning and management. Ungauged sites have no observations about the magnitude of floods, but some site and basin characteristics are known. Regression models relate physiographic and climatic basin characteristics to flood quantiles, which can be estimated from observed data at gauged sites. However, these models assume linear relationships between variables. Prediction intervals are estimated by the variance of the residuals in the estimated model. Furthermore, the effect of the uncertainties in the explanatory variables on the dependent variable cannot be assessed. This paper presents a methodology to propagate the uncertainties that arise in the process of predicting flood quantiles at ungauged basins by a regression model. In addition, Bayesian networks were explored as a feasible tool for predicting flood quantiles at ungauged sites. Bayesian networks benefit from taking into account uncertainties thanks to their probabilistic nature. They are able to capture non-linear relationships between variables and they give a probability distribution of discharges as a result. The methodology was applied to a case study in the Tagus basin in Spain.

Relevance: 30.00%

Abstract:

devcon transforms the coefficients of 0/1 dummy variables so that they reflect deviations from the "grand mean" rather than deviations from the reference category (the transformed coefficients are equivalent to those obtained by so-called "effects coding") and adds the coefficient for the reference category. The variance-covariance matrix of the estimates is transformed accordingly. The transformed estimates can be used with post-estimation procedures. In particular, devcon can be used to solve the identification problem for dummy variable effects in the so-called Blinder-Oaxaca decomposition (see the oaxaca package).
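
The transformation described above can be illustrated outside Stata with a short sketch. It is not devcon's code: it assumes a single set of dummies, an unweighted grand mean, and hypothetical function and variable names. The coefficients are re-expressed as deviations from the mean of all category effects (reference category included, with coefficient 0), and the covariance matrix is transformed as W V Wᵀ.

```python
def deviation_contrasts(coefs, cov):
    """Re-express dummy coefficients as deviations from the grand mean.

    coefs: the k-1 coefficients relative to the omitted reference category.
    cov:   their (k-1) x (k-1) variance-covariance matrix (list of lists).
    Returns k coefficients (reference category first) and a k x k covariance.
    """
    k = len(coefs) + 1
    full = [0.0] + list(coefs)  # reference category has coefficient 0

    # Pad the covariance with a zero row and column for the reference category.
    full_cov = [[0.0] * k for _ in range(k)]
    for i in range(k - 1):
        for j in range(k - 1):
            full_cov[i + 1][j + 1] = cov[i][j]

    # W = I - (1/k) * ones: subtracts the unweighted mean of all k effects.
    w = [[(1.0 if i == j else 0.0) - 1.0 / k for j in range(k)]
         for i in range(k)]

    new_coefs = [sum(w[i][j] * full[j] for j in range(k)) for i in range(k)]
    wv = [[sum(w[i][m] * full_cov[m][j] for m in range(k)) for j in range(k)]
          for i in range(k)]
    new_cov = [[sum(wv[i][m] * w[j][m] for m in range(k)) for j in range(k)]
               for i in range(k)]
    return new_coefs, new_cov


# Hypothetical three-category example with two estimated dummy coefficients.
new_coefs, new_cov = deviation_contrasts([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]])
```

The transformed coefficients sum to zero and differences between categories are unchanged, so the result no longer depends on which category was omitted. In a full model the constant would absorb the subtracted mean, which this sketch leaves out.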

Relevance: 30.00%

Abstract:

Spatial characterization of non-Gaussian attributes in earth sciences and engineering commonly requires the estimation of their conditional distribution. The indicator and probability kriging approaches of current nonparametric geostatistics provide approximations for estimating conditional distributions. They do not, however, provide results similar to those in the cumbersome implementation of simultaneous cokriging of indicators. This paper presents a new formulation termed successive cokriging of indicators that avoids the classic simultaneous solution and related computational problems, while obtaining equivalent results to the impractical simultaneous solution of cokriging of indicators. A successive minimization of the estimation variance of probability estimates is performed, as additional data are successively included into the estimation process. In addition, the approach leads to an efficient nonparametric simulation algorithm for non-Gaussian random functions based on residual probabilities.

Relevance: 30.00%

Abstract:

We investigate whether relative contributions of genetic and shared environmental factors are associated with an increased risk in melanoma. Data from the Queensland Familial Melanoma Project comprising 15,907 subjects arising from 1,912 families were analyzed to estimate the additive genetic, common and unique environmental contributions to variation in the age at onset of melanoma. Two complementary approaches for analyzing correlated time-to-onset family data were considered: the generalized estimating equations (GEE) method, in which one can estimate relationship-specific dependence simultaneously with regression coefficients that describe the average population response to changing covariates; and a subject-specific Bayesian mixed model, in which heterogeneity in regression parameters is explicitly modeled and the different components of variation may be estimated directly. The proportional hazards and Weibull models were utilized, as both produce natural frameworks for estimating relative risks while adjusting for simultaneous effects of other covariates. A simple Markov Chain Monte Carlo method for covariate imputation of missing data was used, and the actual implementation of the Bayesian model was based on Gibbs sampling using the freeware package BUGS. In addition, we used a Bayesian model to investigate the relative contribution of genetic and environmental effects on the expression of naevi and freckles, which are known risk factors for melanoma.

Relevance: 30.00%

Abstract:

Genetic assignment methods use genotype likelihoods to draw inference about where individuals were or were not born, potentially allowing direct, real-time estimates of dispersal. We used simulated data sets to test the power and accuracy of Monte Carlo resampling methods in generating statistical thresholds for identifying F-0 immigrants in populations with ongoing gene flow, and hence for providing direct, real-time estimates of migration rates. The identification of accurate critical values required that resampling methods preserved the linkage disequilibrium deriving from recent generations of immigrants and reflected the sampling variance present in the data set being analysed. A novel Monte Carlo resampling method taking into account these aspects was proposed and its efficiency was evaluated. Power and error were relatively insensitive to the frequency assumed for missing alleles. Power to identify F-0 immigrants was improved by using large sample size (up to about 50 individuals) and by sampling all populations from which migrants may have originated. A combination of plotting genotype likelihoods and calculating mean genotype likelihood ratios (D-LR) appeared to be an effective way to predict whether F-0 immigrants could be identified for a particular pair of populations using a given set of markers.

Relevance: 30.00%

Abstract:

Single male sexually selected traits have been found to exhibit substantial genetic variance, even though natural and sexual selection are predicted to deplete genetic variance in these traits. We tested whether genetic variance in multiple male display traits of Drosophila serrata was maintained under field conditions. A breeding design involving 300 field-reared males and their laboratory-reared offspring allowed the estimation of the genetic variance-covariance matrix for six male cuticular hydrocarbons (CHCs) under field conditions. Despite individual CHCs displaying substantial genetic variance under field conditions, the vast majority of genetic variance in CHCs was not closely associated with the direction of sexual selection measured on field phenotypes. Relative concentrations of three CHCs correlated positively with body size in the field, but not under laboratory conditions, suggesting condition-dependent expression of CHCs under field conditions. Therefore condition dependence may not maintain genetic variance in preferred combinations of male CHCs under field conditions, suggesting that the large mutational target supplied by the evolution of condition dependence may not provide a solution to the lek paradox in this species. Sustained sexual selection may be adequate to deplete genetic variance in the direction of selection, perhaps as a consequence of the low rate of favorable mutations expected in multiple trait systems.

Relevance: 30.00%

Abstract:

The study of continuously varying, quantitative traits is important in evolutionary biology, agriculture, and medicine. Variation in such traits is attributable to many, possibly interacting, genes whose expression may be sensitive to the environment, which makes their dissection into underlying causative factors difficult. An important population parameter for quantitative traits is heritability, the proportion of total variance that is due to genetic factors. Response to artificial and natural selection and the degree of resemblance between relatives are all a function of this parameter. Following the classic paper by R. A. Fisher in 1918, the estimation of additive and dominance genetic variance and heritability in populations is based upon the expected proportion of genes shared between different types of relatives, and explicit, often controversial and untestable models of genetic and non-genetic causes of family resemblance. With genome-wide coverage of genetic markers it is now possible to estimate such parameters solely within families using the actual degree of identity-by-descent sharing between relatives. Using genome scans on 4,401 quasi-independent sib pairs of which 3,375 pairs had phenotypes, we estimated the heritability of height from empirical genome-wide identity-by-descent sharing, which varied from 0.374 to 0.617 (mean 0.498, standard deviation 0.036). The variance in identity-by-descent sharing per chromosome and per genome was consistent with theory. The maximum likelihood estimate of the heritability for height was 0.80 with no evidence for non-genetic causes of sib resemblance, consistent with results from independent twin and family studies but using an entirely separate source of information. Our application shows that it is feasible to estimate genetic variance solely from within-family segregation and provides an independent validation of previously untestable assumptions. 
Given sufficient data, our new paradigm will allow the estimation of genetic variation for disease susceptibility and quantitative traits that is free from confounding with non-genetic factors and will allow partitioning of genetic variation into additive and non-additive components.
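
The logic of estimating heritability from variation in identity-by-descent (IBD) sharing can be sketched with a simpler technique than the maximum likelihood analysis reported above: a Haseman-Elston style regression, swapped in here purely for illustration. It assumes phenotypes standardized to unit variance and a sib covariance of π·h² plus a term that does not depend on π, so that the expected squared sib difference falls linearly in the IBD proportion π with slope −2h². The data below are invented and noise-free.

```python
def he_heritability(ibd_sharing, squared_diffs):
    """Estimate heritability as -slope/2 from an ordinary least squares
    regression of squared sib-pair phenotype differences on genome-wide
    IBD sharing (phenotypes assumed standardized to unit variance).
    """
    n = len(ibd_sharing)
    mean_x = sum(ibd_sharing) / n
    mean_y = sum(squared_diffs) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(ibd_sharing, squared_diffs))
    sxx = sum((x - mean_x) ** 2 for x in ibd_sharing)
    slope = sxy / sxx
    return -slope / 2.0


# Invented, noise-free sib pairs generated with h^2 = 0.8, so the
# expected squared difference is 2 - 2 * 0.8 * pi for IBD proportion pi.
pis = [0.40, 0.45, 0.50, 0.55, 0.60]
sq_diffs = [2.0 - 2.0 * 0.8 * p for p in pis]
h2 = he_heritability(pis, sq_diffs)
```

Because IBD sharing between sibs varies only narrowly around 0.5, real data give a noisy slope, which is why large numbers of sib pairs are needed for a usable estimate.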