940 results for maximum likelihood analysis


Abstract:

The major cause of athlete's foot is Trichophyton rubrum, a dermatophyte or fungal pathogen of human skin. To facilitate molecular analyses of the dermatophytes, we sequenced T. rubrum and four related species, Trichophyton tonsurans, Trichophyton equinum, Microsporum canis, and Microsporum gypseum. These species differ in host range, mating, and disease progression. The dermatophyte genomes are highly colinear yet contain gene family expansions not found in other human-associated fungi. Dermatophyte genomes are enriched for gene families containing the LysM domain, which binds chitin and potentially related carbohydrates. These LysM domains differ in sequence from those in other species in regions of the peptide that could affect substrate binding. The dermatophytes also encode novel sets of fungus-specific kinases with unknown specificity, including nonfunctional pseudokinases, which may inhibit phosphorylation by competing for kinase sites within substrates, acting as allosteric effectors, or acting as scaffolds for signaling. The dermatophytes are also enriched for a large number of enzymes that synthesize secondary metabolites, including dermatophyte-specific genes that could synthesize novel compounds. Finally, dermatophytes are enriched in several classes of proteases that are necessary for fungal growth and nutrient acquisition on keratinized tissues. Despite differences in mating ability, genes involved in mating and meiosis are conserved across species, suggesting the possibility of cryptic mating in species where it has not been previously detected. These genome analyses identify gene families that are important to our understanding of how dermatophytes cause chronic infections, how they interact with epithelial cells, and how they respond to the host immune response.

IMPORTANCE: Athlete's foot, jock itch, ringworm, and nail infections are common fungal infections, all caused by fungi known as dermatophytes (fungi that infect skin). This report presents the genome sequences of Trichophyton rubrum, the most frequent cause of athlete's foot, as well as four other common dermatophytes. Dermatophyte genomes are enriched for four gene classes that may contribute to the ability of these fungi to cause disease. These include (i) proteases secreted to degrade skin; (ii) kinases, including pseudokinases, that are involved in signaling necessary for adapting to skin; (iii) secondary metabolites, compounds that act as toxins or signals in the interactions between fungus and host; and (iv) a class of proteins (LysM) that appear to bind and mask cell wall components and carbohydrates, thus avoiding the host's immune response to the fungi. These genome sequences provide a strong foundation for future work in understanding how dermatophytes cause disease.

Abstract:

Bovine coronavirus (BCoV) has been associated with diarrhoea in newborn calves, winter dysentery in adult cattle, and respiratory tract infections in calves and feedlot cattle. In Cuba, the presence of BCoV was first reported in 2006. Since then, sporadic outbreaks have continued to occur. This study aimed to deepen knowledge of the evolution, molecular markers of virulence, and epidemiology of BCoV in Cuba. A total of 30 samples collected between 2009 and 2011 were used for PCR amplification and direct sequencing of the partial or full S gene. Sequence comparison and phylogenetic studies were conducted using partial or complete S gene sequences as phylogenetic markers. All Cuban bovine coronavirus sequences fell within a single cluster supported by a 100% bootstrap value and a 1.00 posterior probability. The Cuban sequences also clustered with the USA BCoV strains corresponding to GenBank accession numbers EF424621 and EF424623, suggesting a common origin for these viruses. This phylogenetic cluster was also the only group of sequences in which no recombination events were detected. Of the 45 amino acid changes found in the Cuban strains, four were unique.

Abstract:

HIV-1 subtype C has spread efficiently in the southern states of Brazil (Rio Grande do Sul, Santa Catarina and Parana). Phylogeographic studies indicate that the subtype C epidemic in southern Brazil was initiated by the introduction of a single founder virus population at some time point between 1960 and 1980, but little is known about the spatial dynamics of viral spread. A total of 135 Brazilian HIV-1 subtype C pol sequences collected from 1992 to 2009 at the three southern state capitals (Porto Alegre, Florianopolis and Curitiba) were analyzed. Maximum-likelihood and Bayesian methods were used to explore the degree of phylogenetic mixing of subtype C sequences from different cities and to reconstruct the geographical pattern of viral spread in this region of the country. Phylogeographic analyses supported the monophyletic origin of the HIV-1 subtype C clade circulating in southern Brazil and placed the root of that clade in Curitiba (Parana state). This analysis further suggested that Florianopolis (Santa Catarina state) is an important staging post in subtype C dissemination, displaying high viral migration rates from and to the other cities, while viral flux between Curitiba and Porto Alegre (Rio Grande do Sul state) is very low. We found a positive correlation (r² = 0.64) between routine travel and viral migration rates among localities. Despite the intense viral movement, phylogenetic intermixing of subtype C sequences from different Brazilian cities is lower than expected by chance. Notably, a high proportion (67%) of subtype C sequences from Porto Alegre branched within a single local monophyletic sub-cluster. These results suggest that the HIV-1 subtype C epidemic in southern Brazil has been shaped by both frequent viral migration among states and in situ dissemination of local clades.
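As a toy illustration of the reported travel-migration correlation, the sketch below computes a coefficient of determination from invented pairwise travel volumes and migration rates; the arrays are hypothetical placeholders, not the study's data.

```python
import numpy as np

# Hypothetical city-pair routine-travel volumes and phylogeographically
# inferred viral migration rates; values invented for illustration only.
travel = np.array([120.0, 45.0, 30.0, 80.0, 15.0, 60.0])
migration = np.array([0.9, 0.4, 0.2, 0.7, 0.1, 0.5])

r = np.corrcoef(travel, migration)[0, 1]  # Pearson correlation
print(f"r^2 = {r**2:.2f}")                # coefficient of determination
```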

Abstract:

This thesis introduces new processing techniques for computer-aided interpretation of ultrasound images, with the purpose of supporting medical diagnosis. In terms of practical application, the goal of this work is the improvement of current prostate biopsy protocols by providing physicians with a visual map, overlaid on ultrasound images, marking regions potentially affected by disease. As far as analysis techniques are concerned, the main contribution of this work to the state of the art is the introduction of deconvolution as a pre-processing step in the standard ultrasonic tissue characterization procedure, to improve the diagnostic significance of ultrasonic features. This thesis also includes some innovations in ultrasound modeling, in particular the employment of a continuous-time autoregressive moving-average (CARMA) model for ultrasound signals, a new maximum-likelihood CARMA estimator based on exponential splines, and the definition of CARMA parameters as new ultrasonic features able to capture scatterer concentration. Finally, concerning the clinical usefulness of the developed techniques, the main contribution of this research is showing, through a study based on medical ground truth, that a reduction in the number of sampled cores in standard prostate biopsy is possible while preserving the diagnostic power of the current clinical protocol.
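The thesis's continuous-time ARMA estimator based on exponential splines has no off-the-shelf implementation; as a rough stand-in, the sketch below fits a discrete-time ARMA model to a placeholder RF scan line with statsmodels and reads the fitted parameters as candidate tissue features. The signal and the model order are assumptions for illustration, not the thesis's method.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
rf_line = rng.standard_normal(2048)   # placeholder for one ultrasound RF line

# Discrete-time ARMA(2,1) fit (order (p, d, q) with d = 0); the thesis
# uses a continuous-time (CARMA) formulation instead.
result = ARIMA(rf_line, order=(2, 0, 1)).fit()
print(result.params)                  # AR/MA coefficients as candidate features
```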

Abstract:

An extensive sample (2%) of private vehicles in Italy is equipped with a GPS device that periodically measures position and dynamical state for insurance purposes. Access to this type of data makes it possible to develop theoretical and practical applications of great interest: the real-time reconstruction of the traffic state in a given region, the development of accurate models of vehicle dynamics, and the study of the cognitive dynamics of drivers. For these applications to be possible, we first need the ability to reconstruct the paths taken by vehicles on the road network from the raw GPS data. These data are affected by positioning errors and are often far apart from each other (~2 km), so the task of path identification is not straightforward. This thesis describes the approach we followed to reliably identify vehicle paths from this kind of low-sampling-rate data. The problem of matching data points to roads is solved with a Bayesian maximum-likelihood approach, while the identification of the path taken between two consecutive GPS measurements is performed with a purpose-built optimal routing algorithm based on the A* algorithm. The procedure was applied to an off-line urban data sample and proved to be robust and accurate. Future developments will extend the procedure to real-time execution and nationwide coverage.
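As a sketch of the routing step, here is a minimal A* search on a toy road graph; the node coordinates, edge costs and the haversine heuristic are illustrative assumptions, not the thesis's purpose-built algorithm.

```python
import heapq
import math

coords = {"a": (44.49, 11.34), "b": (44.50, 11.36), "c": (44.51, 11.38)}
graph = {"a": [("b", 1.9)], "b": [("c", 1.8)], "c": []}  # node -> [(nbr, km)]

def haversine_km(p, q):
    """Great-circle distance, an admissible heuristic for road travel."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def astar(start, goal):
    # Frontier entries: (f = g + heuristic, cost so far, node, path)
    frontier = [(haversine_km(coords[start], coords[goal]), 0.0, start, [start])]
    best = {start: 0.0}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nbr, cost in graph[node]:
            g2 = g + cost
            if g2 < best.get(nbr, float("inf")):
                best[nbr] = g2
                f = g2 + haversine_km(coords[nbr], coords[goal])
                heapq.heappush(frontier, (f, g2, nbr, path + [nbr]))
    return None, float("inf")

print(astar("a", "c"))  # (['a', 'b', 'c'], 3.7)
```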

Abstract:

In the present work we perform an econometric analysis of the Tribal art market. To this aim, we use a unique and original database that includes information on Tribal art auctions worldwide from 1998 to 2011. In the literature, art prices are modelled through the hedonic regression model, a classic fixed-effect model. The main drawback of the hedonic approach is the large number of parameters, since, in general, art data include many categorical variables. In this work, we propose a multilevel model for the analysis of Tribal art prices that takes into account the influence of time on artwork prices. In fact, it is natural to assume that time exerts an influence over the price dynamics in various ways. Nevertheless, since the set of objects changes at every auction date, we do not have repeated measurements of the same items over time. Hence, the dataset does not constitute a proper panel; rather, it has a two-level structure in which items, the level-1 units, are grouped in time points, the level-2 units. The main theoretical contribution is the extension of classical multilevel models to cope with the case described above. In particular, we introduce a model with time-dependent random effects at the second level. We propose a novel specification of the model, derive the maximum likelihood estimators and implement them through the E-M algorithm. We test the finite-sample properties of the estimators and the validity of our own R code by means of a simulation study. Finally, we show that the new model considerably improves the fit of the Tribal art data with respect to both the hedonic regression model and the classic multilevel model.
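For context, here is a minimal sketch of the classic two-level baseline the thesis improves upon: a random intercept per auction date, fitted with statsmodels. The variable names and synthetic data are invented, and the thesis's time-dependent random effects extension is not available in standard packages.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the auction data: items (level 1) grouped in
# auction dates (level 2); a random intercept captures date effects.
rng = np.random.default_rng(1)
n = 200
data = pd.DataFrame({
    "log_price": rng.normal(8, 1, n),
    "height_cm": rng.uniform(10, 120, n),
    "auction_date": rng.choice(["2008-06", "2009-11", "2010-05", "2011-03"], n),
})

fit = smf.mixedlm("log_price ~ height_cm", data,
                  groups=data["auction_date"]).fit()
print(fit.summary())
```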

Abstract:

Analysis of recurrent events has been widely discussed in the medical, health services, insurance, and engineering areas in recent years. This research proposes a nonhomogeneous Yule process with the proportional intensity assumption to model the hazard function of recurrent events data and the associated risk factors. The method assumes that repeated events occur for each individual, with given covariates, according to a nonhomogeneous Yule process with intensity function λ_x(t) = λ₀(t)·exp(x′β). One advantage of using a nonhomogeneous Yule process for recurrent events is that the recurrence rate is assumed proportional to the number of events that have occurred up to time t. Maximum likelihood estimation is used to provide estimates of the parameters in the model, and a generalized scoring iterative procedure is applied in the numerical computation. Model comparisons between the proposed method and other existing recurrent-event models are addressed by simulation. One example is examined, concerning recurrent myocardial infarction events compared between two distinct populations, Mexican-Americans and non-Hispanic Whites, in the Corpus Christi Heart Project.
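A small simulation sketch of the assumed intensity: event times are drawn by Ogata-style thinning from λ_x(t) = (N(t)+1)·λ₀(t)·exp(x′β), with the count factor reflecting the Yule assumption that the rate grows with the number of prior events. The baseline λ₀, covariates and β below are hypothetical choices, not estimates from the study.

```python
import numpy as np

rng = np.random.default_rng(2)
beta = np.array([0.5, -0.3])
x = np.array([1.0, 2.0])                 # covariates for one individual

def lam0(t):                             # assumed increasing baseline intensity
    return 0.2 + 0.1 * t

T = 10.0
t, events = 0.0, []
while t < T:
    n = len(events)
    # While the count is n, the intensity is bounded by its value at T.
    lam_max = (n + 1) * lam0(T) * np.exp(x @ beta)
    t += rng.exponential(1.0 / lam_max)  # candidate event from dominating rate
    if t > T:
        break
    lam_t = (n + 1) * lam0(t) * np.exp(x @ beta)
    if rng.uniform() < lam_t / lam_max:  # thinning acceptance step
        events.append(t)
print(events)
```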

Abstract:

In geographical epidemiology, maps of disease rates and disease risk provide a spatial perspective for researching disease etiology. For rare diseases, or when the population base is small, the rate and risk estimates may be unstable. Empirical Bayes (EB) methods have been used to spatially smooth the estimates by permitting an area estimate to "borrow strength" from its neighbors. Such EB methods include the use of a Gamma model, of a James-Stein estimator, and of a conditional autoregressive (CAR) process. A fully Bayesian analysis of the CAR process is proposed. One advantage of this fully Bayesian analysis is that it can be implemented simply by repeated sampling from the posterior densities; a Markov chain Monte Carlo technique such as the Gibbs sampler is not necessary. Direct resampling from the posterior densities provides exact small-sample inferences instead of the approximate asymptotic analyses of maximum likelihood methods (Clayton & Kaldor, 1987). Further, the proposed CAR model allows covariates to be included in the model. A simulation demonstrates the effect of sample size on the fully Bayesian analysis of the CAR process. The methods are applied to lip cancer data from Scotland, and the results are compared.

Abstract:

The history of the logistic function since its introduction in 1838 is reviewed, and the logistic model for a polychotomous response variable is presented with a discussion of the assumptions involved in its derivation and use. Following this, the maximum likelihood estimators for the model parameters are derived along with a Newton-Raphson iterative procedure for evaluation. A rigorous mathematical derivation of the limiting distribution of the maximum likelihood estimators is then presented using a characteristic function approach. An appendix with theorems on the asymptotic normality of sample sums when the observations are not identically distributed, with proofs, supports the presentation on asymptotic properties of the maximum likelihood estimators. Finally, two applications of the model are presented using data from the Hypertension Detection and Follow-up Program, a prospective, population-based, randomized trial of treatment for hypertension. The first application compares the risk of five-year mortality from cardiovascular causes with that from noncardiovascular causes; the second application compares risk factors for fatal or nonfatal coronary heart disease with those for fatal or nonfatal stroke.
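A minimal Newton-Raphson sketch for the dichotomous special case of the logistic model, on synthetic data; the polychotomous version replaces the scalar probability with a multinomial one, but the score/information iteration has the same shape.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
X = np.column_stack([np.ones(n), rng.standard_normal(n)])  # intercept + x
true_beta = np.array([-0.5, 1.2])
y = rng.uniform(size=n) < 1 / (1 + np.exp(-X @ true_beta))  # Bernoulli draws

beta = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    score = X.T @ (y - p)                      # gradient of the log-likelihood
    info = X.T @ (X * (p * (1 - p))[:, None])  # Fisher information matrix
    step = np.linalg.solve(info, score)        # Newton-Raphson update
    beta += step
    if np.max(np.abs(step)) < 1e-10:           # convergence check
        break
print(beta)  # should be close to true_beta
```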

Abstract:

Traditional comparison of standardized mortality ratios (SMRs) can be misleading if the age-specific mortality ratios are not homogeneous. For this reason, a regression model has been developed which expresses the mortality ratio as a function of age. This model is then applied to mortality data from an occupational cohort study. The nature of the occupational data necessitates the investigation of mortality ratios which increase with age. These occupational data are used primarily to illustrate and develop the statistical methodology. The age-specific mortality ratio (MR) for the covariates of interest can be written as MR_{ij...m} = μ_{ij...m}/θ_{ij...m} = r·exp(Z′_{ij...m}β), where μ_{ij...m} and θ_{ij...m} denote the force of mortality in the study and chosen standard populations in the ij...m-th stratum, respectively, r is the intercept, Z_{ij...m} is the vector of covariables associated with the i-th age interval, and β is a vector of regression coefficients associated with these covariables. A Newton-Raphson iterative procedure has been used to determine the maximum likelihood estimates of the regression coefficients. This model provides a statistical method for a logical and easily interpretable explanation of an occupational cohort's mortality experience. Since it gives a reasonable fit to the mortality data, it can also be concluded that the model is fairly realistic. The traditional statistical method for the analysis of occupational cohort mortality data is to present a summary index such as the SMR under the assumption of constant (homogeneous) age-specific mortality ratios. Since the mortality ratios for occupational groups usually increase with age, the homogeneity assumption of the age-specific mortality ratios is often untenable. The traditional method of comparing SMRs under the homogeneity assumption is a special case of this model, without age as a covariate. This model also provides a statistical technique to evaluate the relative risk between two SMRs or a dose-response relationship among several SMRs. The model presented has applications in the medical, demographic and epidemiologic areas. The methods developed in this thesis are suitable for future analyses of mortality or morbidity data when the age-specific mortality/morbidity experience is a function of age or when an interaction effect between confounding variables needs to be evaluated.
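One standard way to fit a multiplicative model of this form, sketched below under invented counts: regress observed deaths on age with log(expected deaths) as a Poisson offset, so the intercept estimates log r and the age coefficient captures the MR trend. This is a generic illustration, not necessarily the thesis's exact Newton-Raphson implementation.

```python
import numpy as np
import statsmodels.api as sm

# Invented stratum-level counts: observed deaths in the study cohort and
# expected deaths computed from standard-population rates.
age_mid = np.array([25.0, 35.0, 45.0, 55.0, 65.0])
observed = np.array([4, 9, 21, 40, 68])
expected = np.array([3.1, 6.5, 12.8, 22.4, 35.0])

X = sm.add_constant(age_mid)                  # intercept log(r) + age effect
fit = sm.GLM(observed, X, family=sm.families.Poisson(),
             offset=np.log(expected)).fit()
print(fit.params)  # [log r, beta_age]; exp(beta_age) = MR trend per year of age
```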

Abstract:

It is well known that an identification problem exists in the analysis of age-period-cohort data because of the relationship among the three factors (date of birth + age at death = date of death). There are numerous suggestions about how to analyze such data, but no single solution has been satisfactory. The purpose of this study is to provide another analytic method by extending Cox's life-table regression model with time-dependent covariates. The new approach has the following features: (1) It is based on the conditional maximum likelihood procedure using the proportional hazard function described by Cox (1972), treating the age factor as the underlying hazard to estimate the parameters for the cohort and period factors. (2) The model is flexible, so that both the cohort and period factors can be treated as dummy or continuous variables, and parameter estimates can be obtained for numerous combinations of variables, as in a regression analysis. (3) The model is applicable even when the time periods are unequally spaced. Two specific models are considered to illustrate the new approach and are applied to the U.S. prostate cancer data. We find that there are significant differences between all cohorts and that there is a significant period effect for both whites and nonwhites. The underlying hazard increases exponentially with age, indicating that old people have a much higher risk than young people. A log transformation of relative risk shows that prostate cancer risk declined in recent cohorts for both models. However, prostate cancer risk declined 5 cohorts (25 years) earlier for whites than for nonwhites under the period-factor model (0 0 0 1 1 1 1). These latter results are similar to the previous study by Holford (1983). The new approach offers a general method to analyze age-period-cohort data without imposing any arbitrary constraint in the model.
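The identification problem can be shown in a few lines: because period = cohort + age, a design matrix carrying all three linear effects is exactly collinear and hence rank-deficient. The numbers below are arbitrary.

```python
import numpy as np

# Six hypothetical deaths: period (date of death) is determined by
# cohort (date of birth) plus age at death.
age = np.array([50, 55, 60, 50, 55, 60], dtype=float)
cohort = np.array([1900, 1900, 1900, 1905, 1905, 1905], dtype=float)
period = cohort + age

X = np.column_stack([np.ones(6), age, period, cohort])
print(np.linalg.matrix_rank(X))  # 3, not 4: one linear effect is redundant
```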

Abstract:

In this dissertation, we propose a continuous-time Markov chain model to examine longitudinal data in which the outcome variable has three categories. The advantage of this model is that it permits a different number of measurements for each subject, and the duration between two consecutive measurement time points can be irregular. Using the maximum likelihood principle, we can estimate the transition probability between two time points. By using the information provided by the independent variables, this model can also estimate the transition probability for each subject. The Monte Carlo simulation method is used to investigate the goodness of model fit compared with that obtained from other models. A public health example is used to demonstrate the application of this method.
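The core computation can be sketched as follows: given a generator matrix Q (here hypothetical), the transition probabilities over an irregular gap t are P(t) = expm(Qt); covariate effects would enter by modifying the entries of Q per subject.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical generator for a 3-state chain; off-diagonal entries are
# transition rates, and each row sums to zero.
Q = np.array([[-0.30,  0.20,  0.10],
              [ 0.05, -0.15,  0.10],
              [ 0.00,  0.04, -0.04]])

for t in (0.5, 2.0, 7.3):       # irregular durations between measurements
    P = expm(Q * t)             # row-stochastic transition probability matrix
    print(f"t = {t}: P[0] = {np.round(P[0], 3)}")
```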

Abstract:

Pragmatism is the leading motivation for regularization. We can understand regularization as a modification of the maximum-likelihood estimator so that a reasonable answer can be given in an unstable or ill-posed situation. To mention some typical examples, this happens when fitting parametric or non-parametric models with more parameters than data, or when estimating large covariance matrices. Regularization is also used to improve the bias-variance tradeoff of an estimation. The definition of regularization is therefore quite general, and, although the introduction of a penalty is probably the most popular type, it is just one of multiple forms of regularization. In this dissertation, we focus on the applications of regularization for obtaining sparse or parsimonious representations, where only a subset of the inputs is used. A particular form of regularization, L1-regularization, plays a key role in reaching sparsity. Most of the contributions presented here revolve around L1-regularization, although other forms of regularization are explored (also pursuing sparsity in some sense). In addition to presenting a compact review of L1-regularization and its applications in statistics and machine learning, we devise methodology for regression, supervised classification and structure induction of graphical models. Within the regression paradigm, we focus on kernel smoothing, proposing techniques for kernel design that are suitable for high-dimensional settings and sparse regression functions. We also present an application of regularized regression techniques for modeling the response of biological neurons. The supervised classification advances deal, on the one hand, with the application of regularization for obtaining a naïve Bayes classifier and, on the other hand, with a novel algorithm for brain-computer interface design that uses group regularization in an efficient manner. Finally, we present a heuristic for inducing the structure of Gaussian Bayesian networks using L1-regularization as a filter.
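As a minimal illustration of L1-regularization producing sparsity, the sketch below fits a lasso to synthetic data in which only 3 of 20 inputs matter; alpha is an arbitrary penalty weight, not a value from the dissertation.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(4)
n, p = 100, 20
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]                  # sparse ground truth
y = X @ beta + 0.1 * rng.standard_normal(n)

# The L1 penalty drives most coefficients to exactly zero.
fit = Lasso(alpha=0.1).fit(X, y)
print(np.flatnonzero(fit.coef_))             # recovers the sparse support
```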

Abstract:

This paper presents the Expectation-Maximization (EM) algorithm applied to operational modal analysis of structures. The EM algorithm is a general-purpose method for maximum likelihood estimation (MLE) that in this work is used to estimate state-space models. As is well known, the MLE enjoys some optimal statistical properties, which make it very attractive in practice. However, the EM algorithm has two main drawbacks: its slow convergence and the dependence of the solution on the initial values used. This paper proposes two different strategies for choosing initial values for the EM algorithm when used for operational modal analysis: starting with the parameters estimated by the Stochastic Subspace Identification (SSI) method, and starting from random points. The effectiveness of the proposed identification method has been evaluated through numerical simulation and measured vibration data in the context of a benchmark problem. Modal parameters (natural frequencies, damping ratios and mode shapes) of the benchmark structure have been estimated using SSI and the EM algorithm. On the whole, the results show that applying the EM algorithm starting from the SSI solution is very useful for identifying the vibration modes of a structure, discarding the spurious modes that appear in high-order models and discovering other hidden modes. Similar results are obtained using random starting values, although this strategy allows several starting points to be analyzed, which overcomes the dependence on the initial values used.
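The initialization issue is generic to EM, not specific to state-space models; the sketch below illustrates the multi-start remedy on a Gaussian mixture (scikit-learn's GaussianMixture runs EM), as a stand-in for the paper's state-space estimation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)
data = np.concatenate([rng.normal(-3, 1, 300),
                       rng.normal(3, 1, 300)]).reshape(-1, 1)

# n_init controls the number of EM restarts; the best run is kept.
single = GaussianMixture(n_components=2, n_init=1, random_state=0).fit(data)
multi = GaussianMixture(n_components=2, n_init=10, random_state=0).fit(data)

# The best of 10 restarts attains a log-likelihood bound at least as good
# as a single start, mitigating the dependence on initial values.
print(single.lower_bound_, multi.lower_bound_)
```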

Abstract:

This paper presents a time-domain stochastic system identification method based on maximum likelihood estimation and the Expectation-Maximization algorithm. The effectiveness of this structural identification method is evaluated through numerical simulation in the context of the ASCE benchmark problem on structural health monitoring. Modal parameters (eigenfrequencies, damping ratios and mode shapes) of the benchmark structure have been estimated by applying the proposed identification method to a set of 100 simulated cases. The numerical results show that the proposed method estimates all the modal parameters reasonably well, even in the presence of 30% measurement noise. Finally, advantages and disadvantages of the method are discussed.