972 results for Generalized expectation-maximization algorithm
Abstract:
Frequencies of meiotic configurations in cytogenetic stocks are dependent on chiasma frequencies in segments defined by centromeres, breakpoints, and telomeres. The expectation maximization algorithm is proposed as a general method to perform maximum likelihood estimations of the chiasma frequencies in the intervals between such locations. The estimates can be translated via mapping functions into genetic maps of cytogenetic landmarks. One set of observational data was analyzed to exemplify application of these methods, results of which were largely concordant with other comparable data. The method was also tested by Monte Carlo simulation of frequencies of meiotic configurations from a monotelodisomic translocation heterozygote, assuming six different sample sizes. The estimate averages were always close to the values given initially to the parameters. The maximum likelihood estimation procedures can be extended readily to other kinds of cytogenetic stocks and allow the pooling of diverse cytogenetic data to collectively estimate lengths of segments, arms, and chromosomes.
Abstract:
Count data with excess zeros relative to a Poisson distribution are common in many biomedical applications. A popular approach to the analysis of such data is to use a zero-inflated Poisson (ZIP) regression model. Often, because of the hierarchical study design or the data collection procedure, zero-inflation and lack of independence may occur simultaneously, which renders the standard ZIP model inadequate. To account for the preponderance of zero counts and the inherent correlation of observations, a class of multi-level ZIP regression models with random effects is presented. Model fitting is facilitated using an expectation-maximization algorithm, whereas variance components are estimated via residual maximum likelihood estimating equations. A score test for zero-inflation is also presented. The multi-level ZIP model is then generalized to cope with a more complex correlation structure. Application to the analysis of correlated count data from a longitudinal infant feeding study illustrates the usefulness of the approach.
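As a rough illustration of how an expectation-maximization fit of a ZIP model proceeds, the sketch below handles only the simplest case: an intercept-only model with no covariates and no random effects. It is not the multi-level procedure described in the abstract. The function name `fit_zip_em`, the starting values, and the stopping rule are assumptions made for this example.

```python
import math


def fit_zip_em(counts, n_iter=1000, tol=1e-12):
    """EM for an intercept-only zero-inflated Poisson model (illustrative sketch).

    pi  : probability that an observation is a structural zero
    lam : mean of the Poisson component
    """
    n = len(counts)
    n_zero = sum(1 for y in counts if y == 0)
    total = sum(counts)

    # Crude starting values (an assumption of this sketch).
    pi = 0.5 * n_zero / n
    lam = max(total / n, 1e-8)

    for _ in range(n_iter):
        # E-step: posterior probability that an observed zero is structural.
        p_zero = pi + (1.0 - pi) * math.exp(-lam)
        z = pi / p_zero
        expected_structural = n_zero * z

        # M-step: closed-form updates given the expected memberships.
        new_pi = expected_structural / n
        new_lam = total / (n - expected_structural)

        converged = abs(new_pi - pi) < tol and abs(new_lam - lam) < tol
        pi, lam = new_pi, new_lam
        if converged:
            break
    return pi, lam
```

At convergence the fitted zero probability, pi + (1 - pi) * exp(-lam), matches the observed fraction of zeros whenever the data contain more zeros than a plain Poisson model would predict.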
Abstract:
Distributed network utility maximization (NUM) is receiving increasing interest for cross-layer optimization problems in multihop wireless networks. Traditional distributed NUM algorithms rely heavily on feedback information between different network elements, such as traffic sources and routers. Because of the distinct features of multihop wireless networks, such as time-varying channels and dynamic network topology, the feedback information is usually inaccurate, which represents a major obstacle to applying distributed NUM to wireless networks. The questions to be answered include whether distributed NUM algorithms can converge with inaccurate feedback and how to design effective distributed NUM algorithms for wireless networks. In this paper, we first use the infinitesimal perturbation analysis technique to provide an unbiased gradient estimate of the aggregate rate of traffic sources at the routers based on locally available information. On that basis, we propose a stochastic approximation algorithm to solve the distributed NUM problem with inaccurate feedback. We then prove that, under certain conditions, the proposed algorithm converges to the optimum solution of distributed NUM with perfect feedback. The proposed algorithm is applied to the joint rate and media access control problem for wireless networks. Numerical results demonstrate the convergence of the proposed algorithm. © 2013 John Wiley & Sons, Ltd.
Abstract:
A new generalized sphere decoding algorithm is proposed for underdetermined MIMO systems with fewer receive antennas N than transmit antennas M. The proposed algorithm is significantly faster than the existing generalized sphere decoding algorithms. The basic idea is to partition the transmitted signal vector into two subvectors x and x with N - 1 and M - N + 1 elements respectively. After some simple transformations, an outer layer Sphere Decoder (SD) can be used to choose proper x, and an inner layer SD is then used to decide x, so that the whole transmitted signal vector is obtained. Simulation results show that Double Layer Sphere Decoding (DLSD) has far lower complexity than the existing Generalized Sphere Decoding (GSD) algorithms.
Abstract:
We study the problem of finding a zero of a maximal monotone operator on a real Hilbert space. Following recent progress on the proximal point algorithm devoted to this problem, we simultaneously introduce a variable metric and a kind of relaxation into the perturbed Tikhonov algorithm studied by P. Tossings. This leads us to work within the framework of variational convergence theory.
Abstract:
Background: Interleukin 8 (IL-8) is a chemokine related to the initiation and amplification of acute and chronic inflammatory processes. Polymorphisms in the IL8 gene have been associated with inflammatory diseases. We investigated whether the -845(T/C) and -738(T/A) single nucleotide polymorphisms (SNPs) in the IL8 gene, as well as the haplotypes they form together with the previously investigated -353(A/T), are associated with susceptibility to chronic periodontitis. Methods: DNA was extracted from buccal epithelial cells of 400 Brazilian individuals (control n = 182, periodontitis n = 218). SNPs were genotyped by the polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) method. Disease associations were analyzed by the chi-squared test, Fisher's exact test and the Clump program. Haplotypes were reconstructed using the expectation-maximization algorithm, and differences in haplotype distribution between the groups were analyzed to estimate genetic susceptibility to chronic periodontitis development. Results: When analyzed individually, no SNPs showed different distributions between the control and chronic periodontitis groups. However, nonsmokers carrying the TTA/CAT (OR = 2.35, 95% CI = 1.03-5.36) and TAT/CTA (OR = 6.05, 95% CI = 1.32-27.7) haplotypes were genetically susceptible to chronic periodontitis. The TTT/TAA haplotype was associated with protection against the development of periodontitis (for nonsmokers OR = 0.22, 95% CI = 0.10-0.46). Conclusion: Although none of the investigated SNPs in the IL8 gene was individually associated with periodontitis, some haplotypes showed significant association with susceptibility to, or protection against, chronic periodontitis in a Brazilian population. (C) 2010 Elsevier B.V. All rights reserved.
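To make the haplotype reconstruction step concrete, here is a minimal sketch of expectation-maximization haplotype frequency estimation for two biallelic SNPs from unphased genotypes. It assumes Hardy-Weinberg equilibrium and is not the software or the three-SNP setting used in the study. The names `HAPLOTYPES`, `compatible_pairs`, and `em_haplotype_freqs`, and the 0/1 allele coding, are assumptions made for this example.

```python
from itertools import combinations_with_replacement

# Alleles are coded 0/1 at each of two loci; a haplotype is (allele1, allele2).
HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def compatible_pairs(genotype):
    """Unordered haplotype pairs whose allele counts match the genotype.

    A genotype is (g1, g2): the number of '1' alleles (0, 1, or 2) at each locus.
    Only the double heterozygote (1, 1) has more than one compatible pair.
    """
    return [
        (h1, h2)
        for h1, h2 in combinations_with_replacement(HAPLOTYPES, 2)
        if (h1[0] + h2[0], h1[1] + h2[1]) == genotype
    ]


def em_haplotype_freqs(genotype_counts, n_iter=500, tol=1e-12):
    """Estimate haplotype frequencies by EM, assuming Hardy-Weinberg equilibrium."""
    freqs = {h: 0.25 for h in HAPLOTYPES}
    n_chrom = 2 * sum(genotype_counts.values())

    for _ in range(n_iter):
        expected = {h: 0.0 for h in HAPLOTYPES}
        for genotype, count in genotype_counts.items():
            pairs = compatible_pairs(genotype)
            # E-step: probability of each phase resolution under current freqs.
            weights = [
                (1.0 if h1 == h2 else 2.0) * freqs[h1] * freqs[h2]
                for h1, h2 in pairs
            ]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                share = count * w / total
                expected[h1] += share
                expected[h2] += share

        # M-step: expected haplotype counts divided by number of chromosomes.
        new_freqs = {h: expected[h] / n_chrom for h in HAPLOTYPES}
        change = max(abs(new_freqs[h] - freqs[h]) for h in HAPLOTYPES)
        freqs = new_freqs
        if change < tol:
            break
    return freqs
```

A call such as `em_haplotype_freqs({(0, 0): 4, (1, 1): 2, (2, 2): 4})` takes a dictionary mapping each genotype to the number of individuals carrying it.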
Abstract:
When examining a rock mass, joint sets and their orientations can play a significant role in how the rock mass will behave. To identify joint sets present in the rock mass, the orientation of individual fracture planes can be measured on exposed rock faces and the resulting data can be examined for heterogeneity. In this article, the expectation-maximization algorithm is used to fit mixtures of Kent component distributions to the fracture data to aid in the identification of joint sets. An additional uniform component is also included in the model to accommodate the noise present in the data.
Abstract:
Research on the problem of feature selection for clustering continues to develop. This is a challenging task, mainly due to the absence of class labels to guide the search for relevant features. Categorical feature selection for clustering has rarely been addressed in the literature, with most of the proposed approaches having focused on numerical data. In this work, we propose an approach to simultaneously cluster categorical data and select a subset of relevant features. Our approach is based on a modification of a finite mixture model (of multinomial distributions), where a set of latent variables indicates the relevance of each feature. To estimate the model parameters, we implement a variant of the expectation-maximization algorithm that simultaneously selects the subset of relevant features, using a minimum message length criterion. The proposed approach compares favourably with two baseline methods: a filter based on an entropy measure and a wrapper based on mutual information. The results obtained on synthetic data illustrate the ability of the proposed expectation-maximization method to recover ground truth. An application to real data, drawn from official statistics, shows its usefulness.
Abstract:
Research on cluster analysis for categorical data continues to develop, with new clustering algorithms being proposed. However, in this context, the determination of the number of clusters is rarely addressed. We propose a new approach in which clustering and the estimation of the number of clusters are done simultaneously for categorical data. We assume that the data originate from a finite mixture of multinomial distributions and use a minimum message length (MML) criterion to select the number of clusters (Wallace and Bolton, 1986). For this purpose, we implement an EM-type algorithm (Silvestre et al., 2008) based on the approach of Figueiredo and Jain (2002). The novelty of the approach rests on the integration of the model estimation and the selection of the number of clusters in a single algorithm, rather than selecting this number based on a set of pre-estimated candidate models. The performance of our approach is compared with the use of the Bayesian Information Criterion (BIC) (Schwarz, 1978) and the Integrated Completed Likelihood (ICL) (Biernacki et al., 2000) using synthetic data. The obtained results illustrate the capacity of the proposed algorithm to attain the true number of clusters while outperforming BIC and ICL, since it is faster, which is especially relevant when dealing with large data sets.
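For orientation, the sketch below shows plain expectation-maximization for a finite mixture of independent categorical (multinomial) features with a fixed number of clusters. It deliberately omits the MML-based selection of the number of clusters that the abstract describes, so it is only the baseline building block, not the authors' algorithm. The function name `em_categorical_mixture`, the random initialization, and the fixed iteration count are assumptions made for this example.

```python
import math
import random


def em_categorical_mixture(data, n_clusters, n_categories, n_iter=100, seed=0):
    """EM for a mixture of independent categorical features (fixed cluster count).

    data         : list of tuples of integer category codes, one entry per feature
    n_categories : number of categories of each feature
    Returns (weights, theta, loglik_trace), where theta[k][j][c] is the
    probability that feature j takes category c in cluster k.
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    weights = [1.0 / n_clusters] * n_clusters

    # Random, strictly positive starting probabilities (assumption of this sketch).
    theta = []
    for _ in range(n_clusters):
        component = []
        for j in range(d):
            raw = [rng.uniform(0.5, 1.5) for _ in range(n_categories[j])]
            s = sum(raw)
            component.append([v / s for v in raw])
        theta.append(component)

    loglik_trace = []
    for _ in range(n_iter):
        # E-step: responsibilities of each cluster for each observation.
        resp = []
        loglik = 0.0
        for x in data:
            joint = [
                weights[k] * math.prod(theta[k][j][x[j]] for j in range(d))
                for k in range(n_clusters)
            ]
            total = sum(joint)
            loglik += math.log(total)
            resp.append([p / total for p in joint])
        loglik_trace.append(loglik)

        # M-step: weighted relative frequencies.
        for k in range(n_clusters):
            nk = sum(r[k] for r in resp)
            if nk == 0.0:
                continue  # leave an empty cluster unchanged
            weights[k] = nk / n
            for j in range(d):
                counts = [0.0] * n_categories[j]
                for x, r in zip(data, resp):
                    counts[x[j]] += r[k]
                theta[k][j] = [c / nk for c in counts]
    return weights, theta, loglik_trace
```

The returned log-likelihood trace should be non-decreasing, which is a convenient sanity check on any EM implementation.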
Abstract:
In this thesis we implement estimation procedures for the threshold parameters of continuous-time threshold models driven by stochastic differential equations. The first procedure is based on the EM (expectation-maximization) algorithm applied to the threshold model built from the Brownian motion with drift process. The second procedure mimics one of the fundamental ideas in the estimation of thresholds in the time series context, that is, conditional least squares estimation. We implement this procedure not only for the threshold model built from the Brownian motion with drift process but also for more generic models, such as the ones built from the geometric Brownian motion or the Ornstein-Uhlenbeck process. Both procedures are implemented for simulated data, and the least squares estimation procedure is also implemented for real data of daily prices from a set of international funds. The first fund is the PF-European Sustainable Equities-R fund from the Pictet Funds company and the second is the Parvest Europe Dynamic Growth fund from the BNP Paribas company. The data for both funds are daily prices from the year 2004. The last fund to be considered is the Converging Europe Bond fund from the Schroder company, and the data are daily prices from the year 2005.
Abstract:
Background: Ankylosing spondylitis (AS) is a chronic inflammatory disorder characterized by inflammation of the spine and sacroiliac joints, leading to progressive joint ankylosis and to progressive deterioration of physical function and quality of life. Early diagnosis and early therapy may contribute to a better prognosis. The identification of biomarkers would be helpful for clinical practice and represents a great challenge for the scientific community. Objectives: The present study had the following aims: 1) to characterize the pattern of AS in Portuguese patients; 2) to investigate MHC and non-MHC gene associations with susceptibility to and phenotypic features of AS; and 3) to identify candidate genes associated with AS by means of whole-genome microarray. Material and Methods: AS was defined in accordance with the modified New York criteria, and AS cases were recruited from the rheumatology outpatient clinics of the participating hospitals. Demographic, clinical and radiological data were recorded and peripheral blood samples collected. A random group of HLA-B27 positive patients and controls was selected and typed for HLA class I and II by PCR-rSSOP. The extended HLA haplotypes were estimated by the expectation-maximization algorithm using Arlequin v3.11 software. Genotyping of IL23R, ERAP1 and ANKH allelic variants was carried out with TaqMan allelic discrimination assays. Association analysis was performed using the Cochran-Armitage and linear regression tests as implemented in PLINK, for dichotomous and quantitative variables, respectively. Gene expression profiling was carried out using Illumina HT-12 Whole-Genome Expression BeadChips, and candidate genes were validated using qPCR-based TaqMan Low Density Arrays (TLDAs). Results: A total of 369 patients (62.3% male; mean age 45.4 ± 13.2 years; mean disease duration 11.4 ± 10.5 years) were included. Regarding clinical disease pattern at the time of assessment, 49.9% had axial disease, 2.4% peripheral disease, 40.9% mixed disease and 7.1% isolated enthesopathic disease. Acute anterior uveitis (33.6%) was the most common extra-articular manifestation. 80.3% of AS patients were HLA-B27 positive. The haplotype A*02/B*27/Cw*02/DRB1*01/DQB1*05 seems to confer susceptibility to AS, whereas A*02/B*27/Cw*01/DRB1*08/DQB1*04 seems to provide protection in terms of disease activity and functional and radiological repercussion. Three markers (two for IL23R and one for ERAP1) showed significant single-locus disease associations. Association of these genes with AS in the Portuguese population was confirmed, whereas the ANKH markers studied did not show an association with AS. No association was seen between non-MHC genes and clinical manifestations of AS. A gene expression signature for AS was established; among the fourteen validated genes, some have a well-documented role in inflammation and others in the modulation of cartilage and bone metabolism. Conclusions: A demographic and clinical profile of patients with AS in Portugal was established. Identification of genetic variants of target genes as well as gene expression signatures could provide a better understanding of AS pathophysiology and could be useful to establish models with relevance in terms of susceptibility, prognosis, and potential therapeutic guidance.
Abstract:
Given the very large amount of data obtained every day through population surveys, much new research could reuse this information instead of collecting new samples. Unfortunately, relevant data are often scattered across different files obtained through different sampling designs. Data fusion is a set of methods used to combine information from different sources into a single dataset. In this article, we are interested in a specific problem: the fusion of two data files, one of which is quite small. We propose a model-based procedure combining a logistic regression with an expectation-maximization algorithm. Results show that despite the lack of data, this procedure can perform better than standard matching procedures.
Abstract:
BACKGROUND: The clinical course of HIV-1 infection is highly variable among individuals, at least in part as a result of genetic polymorphisms in the host. Toll-like receptors (TLRs) have a key role in innate immunity, and mutations in the genes encoding these receptors have been associated with increased or decreased susceptibility to infections. OBJECTIVES: To determine whether single-nucleotide polymorphisms (SNPs) in TLR2-4 and TLR7-9 influenced the natural course of HIV-1 infection. METHODS: Twenty-eight SNPs in TLRs were analysed in HAART-naive HIV-positive patients from the Swiss HIV Cohort Study. The SNPs were detected using Sequenom technology. Haplotypes were inferred using an expectation-maximization algorithm. The CD4 T cell decline was calculated using a least-squares regression. Patients with a rapid CD4 cell decline, less than the 15th percentile, were defined as rapid progressors. The risk of rapid progression associated with SNPs was estimated using a logistic regression model. Other candidate risk factors included age, sex and risk groups (heterosexual, homosexual and intravenous drug use). RESULTS: Two SNPs in TLR9 (1635A/G and +1174G/A) in linkage disequilibrium were associated with the rapid progressor phenotype: for 1635A/G, odds ratio (OR), 3.9 [95% confidence interval (CI), 1.7-9.2] for GA versus AA and OR, 4.7 (95% CI, 1.9-12.0) for GG versus AA (P = 0.0008). CONCLUSION: Rapid progression of HIV-1 infection was associated with TLR9 polymorphisms. Because of its potential implications for intervention strategies and vaccine development, additional epidemiological and experimental studies are needed to confirm this association.
Abstract:
We introduce a new class of bivariate distributions of Marshall-Olkin type, the bivariate Erlang distribution. The Laplace transform, the moments, and the conditional densities are derived. Potential applications in life insurance and finance are considered. The maximum likelihood estimators of the parameters are computed with the expectation-maximization algorithm. Our research then turns to the study of multivariate risk processes, which can be useful in studying ruin problems for insurance companies with dependent classes. We apply results from the theory of piecewise deterministic Markov processes to obtain the exponential martingales needed to establish computable upper bounds for the ruin probability, whose expressions are intractable.