26 resultados para Markov chains hidden Markov models Viterbi algorithm Forward-Backward algorithm maximum likelihood
Resumo:
To obtain a better understanding of the associations among Borderline Personality Disorder (BPD), adult attachment patterns, impulsivity, and aggressiveness, we tested four competing models of these relationships: a) BPD is associated with the personality traits of impulsivity and aggressiveness, but adult attachment patterns predict neither BPD nor impulsive/aggressive features; b) adult attachment patterns are significant predictors of BPD but not of impulsive/aggressive traits, although these traits correlate with BPD; c) adult attachment patterns are significant predictors of impulsive and aggressive traits, which in turn predict BPD; and d) adult attachment patterns significantly predict both BPD and impulsive/aggressive traits. We assessed 466 consecutively admitted outpatients using the Structured Clinical Interview for DSM-IV Axis II Personality Disorders (V. 2.0), the Attachment Style Questionnaire, the Barratt Impulsiveness Scale-11, and the Aggression Questionnaire. Maximum likelihood structural equation modeling of the covariance matrix showed that model (c) was the best fitting model (chi(2) (21) = 31.67, p >.05, RMSEA = .023, test of close fit p >.85). This result indicates that adult attachment patterns act indirectly as risk factors for BPD because of their relationships with aggressive/impulsive personality traits.
Resumo:
We have used the Two-Degree Field (2dF) instrument on the Anglo-Australian Telescope (AAT) to obtain redshifts of a sample of z < 3 and 18.0 < g < 21.85 quasars selected from Sloan Digital Sky Survey (SDSS) imaging. These data are part of a larger joint programme between the SDSS and 2dF communities to obtain spectra of faint quasars and luminous red galaxies, namely the 2dF-SDSS LRG and QSO (2SLAQ) Survey. We describe the quasar selection algorithm and present the resulting number counts and luminosity function of 5645 quasars in 105.7 deg(2). The bright-end number counts and luminosity functions agree well with determinations from the 2dF QSO Redshift Survey (2QZ) data to g similar to 20.2. However, at the faint end, the 2SLAQ number counts and luminosity functions are steeper (i.e. require more faint quasars) than the final 2QZ results from Croom et al., but are consistent with the preliminary 2QZ results from Boyle et al. Using the functional form adopted for the 2QZ analysis ( a double power law with pure luminosity evolution characterized by a second-order polynomial in redshift), we find a faint-end slope of beta =-1.78 +/- 0.03 if we allow all of the parameters to vary, and beta =-1.45 +/- 0.03 if we allow only the faint-end slope and normalization to vary (holding all other parameters equal to the final 2QZ values). Over the magnitude range covered by the 2SLAQ survey, our maximum-likelihood fit to the data yields 32 per cent more quasars than the final 2QZ parametrization, but is not inconsistent with other g > 21 deep surveys for quasars. The 2SLAQ data exhibit no well-defined 'break' in the number counts or luminosity function, but do clearly flatten with increasing magnitude. Finally, we find that the shape of the quasar luminosity function derived from 2SLAQ is in good agreement with that derived from Type I quasars found in hard X-ray surveys.
Resumo:
The study of continuously varying, quantitative traits is important in evolutionary biology, agriculture, and medicine. Variation in such traits is attributable to many, possibly interacting, genes whose expression may be sensitive to the environment, which makes their dissection into underlying causative factors difficult. An important population parameter for quantitative traits is heritability, the proportion of total variance that is due to genetic factors. Response to artificial and natural selection and the degree of resemblance between relatives are all a function of this parameter. Following the classic paper by R. A. Fisher in 1918, the estimation of additive and dominance genetic variance and heritability in populations is based upon the expected proportion of genes shared between different types of relatives, and explicit, often controversial and untestable models of genetic and non-genetic causes of family resemblance. With genome-wide coverage of genetic markers it is now possible to estimate such parameters solely within families using the actual degree of identity-by-descent sharing between relatives. Using genome scans on 4,401 quasi-independent sib pairs of which 3,375 pairs had phenotypes, we estimated the heritability of height from empirical genome-wide identity-by-descent sharing, which varied from 0.374 to 0.617 (mean 0.498, standard deviation 0.036). The variance in identity-by-descent sharing per chromosome and per genome was consistent with theory. The maximum likelihood estimate of the heritability for height was 0.80 with no evidence for non-genetic causes of sib resemblance, consistent with results from independent twin and family studies but using an entirely separate source of information. Our application shows that it is feasible to estimate genetic variance solely from within- family segregation and provides an independent validation of previously untestable assumptions. Given sufficient data, our new paradigm will allow the estimation of genetic variation for disease susceptibility and quantitative traits that is free from confounding with non-genetic factors and will allow partitioning of genetic variation into additive and non-additive components.
Resumo:
Motivation: The clustering of gene profiles across some experimental conditions of interest contributes significantly to the elucidation of unknown gene function, the validation of gene discoveries and the interpretation of biological processes. However, this clustering problem is not straightforward as the profiles of the genes are not all independently distributed and the expression levels may have been obtained from an experimental design involving replicated arrays. Ignoring the dependence between the gene profiles and the structure of the replicated data can result in important sources of variability in the experiments being overlooked in the analysis, with the consequent possibility of misleading inferences being made. We propose a random-effects model that provides a unified approach to the clustering of genes with correlated expression levels measured in a wide variety of experimental situations. Our model is an extension of the normal mixture model to account for the correlations between the gene profiles and to enable covariate information to be incorporated into the clustering process. Hence the model is applicable to longitudinal studies with or without replication, for example, time-course experiments by using time as a covariate, and to cross-sectional experiments by using categorical covariates to represent the different experimental classes. Results: We show that our random-effects model can be fitted by maximum likelihood via the EM algorithm for which the E(expectation) and M(maximization) steps can be implemented in closed form. Hence our model can be fitted deterministically without the need for time-consuming Monte Carlo approximations. The effectiveness of our model-based procedure for the clustering of correlated gene profiles is demonstrated on three real datasets, representing typical microarray experimental designs, covering time-course, repeated-measurement and cross-sectional data. In these examples, relevant clusters of the genes are obtained, which are supported by existing gene-function annotation. A synthetic dataset is considered too.
Resumo:
Objective: Inpatient length of stay (LOS) is an important measure of hospital activity, health care resource consumption, and patient acuity. This research work aims at developing an incremental expectation maximization (EM) based learning approach on mixture of experts (ME) system for on-line prediction of LOS. The use of a batchmode learning process in most existing artificial neural networks to predict LOS is unrealistic, as the data become available over time and their pattern change dynamically. In contrast, an on-line process is capable of providing an output whenever a new datum becomes available. This on-the-spot information is therefore more useful and practical for making decisions, especially when one deals with a tremendous amount of data. Methods and material: The proposed approach is illustrated using a real example of gastroenteritis LOS data. The data set was extracted from a retrospective cohort study on all infants born in 1995-1997 and their subsequent admissions for gastroenteritis. The total number of admissions in this data set was n = 692. Linked hospitalization records of the cohort were retrieved retrospectively to derive the outcome measure, patient demographics, and associated co-morbidities information. A comparative study of the incremental learning and the batch-mode learning algorithms is considered. The performances of the learning algorithms are compared based on the mean absolute difference (MAD) between the predictions and the actual LOS, and the proportion of predictions with MAD < 1 day (Prop(MAD < 1)). The significance of the comparison is assessed through a regression analysis. Results: The incremental learning algorithm provides better on-line prediction of LOS when the system has gained sufficient training from more examples (MAD = 1.77 days and Prop(MAD < 1) = 54.3%), compared to that using the batch-mode learning. The regression analysis indicates a significant decrease of MAD (p-value = 0.063) and a significant (p-value = 0.044) increase of Prop(MAD
Resumo:
Determining the dimensionality of G provides an important perspective on the genetic basis of a multivariate suite of traits. Since the introduction of Fisher's geometric model, the number of genetically independent traits underlying a set of functionally related phenotypic traits has been recognized as an important factor influencing the response to selection. Here, we show how the effective dimensionality of G can be established, using a method for the determination of the dimensionality of the effect space from a multivariate general linear model introduced by AMEMIYA (1985). We compare this approach with two other available methods, factor-analytic modeling and bootstrapping, using a half-sib experiment that estimated G for eight cuticular hydrocarbons of Drosophila serrata. In our example, eight pheromone traits were shown to be adequately represented by only two underlying genetic dimensions by Amemiya's approach and factor-analytic modeling of the covariance structure at the sire level. In, contrast, bootstrapping identified four dimensions with significant genetic variance. A simulation study indicated that while the performance of Amemiya's method was more sensitive to power constraints, it performed as well or better than factor-analytic modeling in correctly identifying the original genetic dimensions at moderate to high levels of heritability. The bootstrap approach consistently overestimated the number of dimensions in all cases and performed less well than Amemiya's method at subspace recovery.
Resumo:
Count data with excess zeros relative to a Poisson distribution are common in many biomedical applications. A popular approach to the analysis of such data is to use a zero-inflated Poisson (ZIP) regression model. Often, because of the hierarchical Study design or the data collection procedure, zero-inflation and lack of independence may occur simultaneously, which tender the standard ZIP model inadequate. To account for the preponderance of zero counts and the inherent correlation of observations, a class of multi-level ZIP regression model with random effects is presented. Model fitting is facilitated using an expectation-maximization algorithm, whereas variance components are estimated via residual maximum likelihood estimating equations. A score test for zero-inflation is also presented. The multi-level ZIP model is then generalized to cope with a more complex correlation structure. Application to the analysis of correlated count data from a longitudinal infant feeding study illustrates the usefulness of the approach.
Resumo:
The capability of cricket batsmen of different skill levels to pick-up information from the pre-release movement pattern of the bowler, from pre-bounce ball flight, and from post-bounce ball flight was examined experimentally. Six highly skilled and six low-skilled cricket batsmen batted against three different leg-spin bowlers while wearing liquid crystal spectacles. The spectacles permitted the specific information available to the batsmen on each trial to be manipulated such that vision was either: (i) occluded at a point prior to the point of ball release (thereby only allowing vision of advance information from the bowler's delivery action); (ii) occluded at a point prior to the point of bat[ bounce (thereby permitting the additional vision of pre-bounce ball flight); or (iii) not occluded (thereby permitting the additional vision of post-bounce bat[ flight information). Measurement was made on each trial of both the accuracy of the definitive (forward-backward) foot movements made by the batsmen and their success (or otherwise) in making bat-bat[ contact. The analyses revealed a superior capability of the more skilled players to make use of earlier (pre-bounce) bat[ flight information to guide successful bat-bat[ interception, thus mirroring the greater use of prospective information pick-up by skilled performers observed in other aspects of batting and in other time-constrained performance domains. (c) 2006 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
Resumo:
Many variables that are of interest in social science research are nominal variables with two or more categories, such as employment status, occupation, political preference, or self-reported health status. With longitudinal survey data it is possible to analyse the transitions of individuals between different employment states or occupations (for example). In the statistical literature, models for analysing categorical dependent variables with repeated observations belong to the family of models known as generalized linear mixed models (GLMMs). The specific GLMM for a dependent variable with three or more categories is the multinomial logit random effects model. For these models, the marginal distribution of the response does not have a closed form solution and hence numerical integration must be used to obtain maximum likelihood estimates for the model parameters. Techniques for implementing the numerical integration are available but are computationally intensive requiring a large amount of computer processing time that increases with the number of clusters (or individuals) in the data and are not always readily accessible to the practitioner in standard software. For the purposes of analysing categorical response data from a longitudinal social survey, there is clearly a need to evaluate the existing procedures for estimating multinomial logit random effects model in terms of accuracy, efficiency and computing time. The computational time will have significant implications as to the preferred approach by researchers. In this paper we evaluate statistical software procedures that utilise adaptive Gaussian quadrature and MCMC methods, with specific application to modeling employment status of women using a GLMM, over three waves of the HILDA survey.
Resumo:
We present a novel, maximum-likelihood (ML), lattice-decoding algorithm for noncoherent block detection of QAM signals. The computational complexity is polynomial in the block length; making it feasible for implementation compared with the exhaustive search ML detector. The algorithm works by enumerating the nearest neighbor regions for a plane defined by the received vector; in a conceptually similar manner to sphere decoding. Simulations show that the new algorithm significantly outperforms existing approaches