937 results for likelihood-based inference
Abstract:
Effective automatic summarization usually requires simulating human reasoning such as abstraction or relevance reasoning. In this paper we describe a solution for this type of reasoning in the particular case of surveillance of the behavior of a dynamic system using sensor data. The paper first presents the approach describing the required type of knowledge with a possible representation. This includes knowledge about the system structure, behavior, interpretation and saliency. Then, the paper shows the inference algorithm to produce a summarization tree based on the exploitation of the physical characteristics of the system. The paper illustrates how the method is used in the context of automatic generation of summaries of behavior in an application for basin surveillance in the presence of river floods.
Abstract:
Data-related properties of the activities involved in a service composition can be used to facilitate several design-time and run-time adaptation tasks, such as service evolution, distributed enactment, and instance-level adaptation. A number of these properties can be expressed using a notion of sharing. We present an approach for automated inference of data properties based on sharing analysis, which is able to handle service compositions with complex control structures, involving loops and sub-workflows. The properties inferred can include data dependencies, information content, domain-defined attributes, privacy or confidentiality levels, among others. The analysis produces characterizations of the data and the activities in the composition in terms of minimal and maximal sharing, which can then be used to verify compliance of potential adaptation actions, or as supporting information in their generation. This sharing analysis approach can be used both at design time and at run time. In the latter case, the results of analysis can be refined using the composition traces (execution logs) at the point of execution, in order to support run-time adaptation.
Abstract:
A new language recognition technique based on applying the philosophy of the Shifted Delta Coefficients (SDC) to phone log-likelihood ratio (PLLR) features is described. The new methodology allows the incorporation of long-span phonetic information at a frame-by-frame level while dealing with the temporal length of each phone unit. The proposed features are used to train an i-vector based system and are tested on the Albayzin LRE 2012 dataset. The results show a relative improvement of 33.3% in Cavg in comparison with different state-of-the-art acoustic i-vector based systems. The paper also presents the integration of parallel phone ASR systems, each of which generates multiple PLLR coefficients that are stacked together and then projected into a reduced dimension. Finally, the paper shows how incorporating state information from the phone ASR provides additional improvements and how fusion with the other acoustic and phonotactic systems provides an important improvement of 25.8% over the system presented during the competition.
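As an illustration of the general shifted-delta idea mentioned in this abstract, the following minimal sketch stacks k delta vectors per frame, each computed d frames either side of a position shifted by multiples of P. The function name `shifted_delta`, the parameter values, the edge-padding rule, and the toy input are assumptions for illustration only; the abstract does not give the configuration the authors applied to PLLR features.

```python
def shifted_delta(frames, d=1, P=3, k=7):
    """Stack k delta vectors per frame.

    Block i of frame t is c(t + i*P + d) - c(t + i*P - d).
    The defaults for d, P and k are illustrative assumptions.
    """
    T = len(frames)

    def at(idx):
        # Clamp out-of-range indices to the first or last frame
        # (edge padding is an assumption of this sketch).
        return frames[min(max(idx, 0), T - 1)]

    out = []
    for t in range(T):
        stacked = []
        for i in range(k):
            ahead = at(t + i * P + d)
            behind = at(t + i * P - d)
            stacked.extend(a - b for a, b in zip(ahead, behind))
        out.append(stacked)
    return out


# Toy input (hypothetical): 10 frames of 2-dimensional features.
toy = [[float(t), float(2 * t)] for t in range(10)]
sdc = shifted_delta(toy, d=1, P=3, k=2)
```

Each output frame has k times the input dimensionality, which is how long-span context ends up in a single frame-level vector.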
Abstract:
A novel pedestrian motion prediction technique is presented in this paper. Its main advantage is that it requires no previous observations, no knowledge of pedestrian trajectories, and no known set of possible destinations, which makes it useful for autonomous surveillance applications. Prediction requires only the initial position of the pedestrian and a 2D representation of the scenario as an occupancy grid. First, it uses the Fast Marching Method (FMM) to calculate the pedestrian arrival time for each position in the map; then, it estimates the likelihood that the pedestrian reaches those positions. The technique has been tested with synthetic and real scenarios. In all cases, accurate probability maps as well as their representative graphs were obtained with low computational cost.
Abstract:
Objective: To determine whether poverty and unemployment increase the likelihood of or delay recovery from common mental disorders, and whether these associations could be explained by subjective financial strain.
Abstract:
A maximum likelihood estimator based on the coalescent for unequal migration rates and different subpopulation sizes is developed. The method uses a Markov chain Monte Carlo approach to investigate possible genealogies with branch lengths and with migration events. Properties of the new method are shown by using simulated data from a four-population n-island model and a source–sink population model. Our estimation method as coded in migrate is tested against genetree; both programs deliver a very similar likelihood surface. The algorithm converges to the estimates fairly quickly, even when the Markov chain is started from unfavorable parameters. The method was used to estimate gene flow in the Nile valley by using mtDNA data from three human populations.
Abstract:
Phylogenetic analyses are increasingly used in attempts to clarify transmission patterns of human immunodeficiency virus type 1 (HIV-1), but there is a continuing discussion about their validity because convergent evolution and transmission of minor HIV variants may obscure epidemiological patterns. Here we have studied a unique HIV-1 transmission cluster consisting of nine infected individuals, for whom the time and direction of each virus transmission was exactly known. Most of the transmissions occurred between 1981 and 1983, and a total of 13 blood samples were obtained approximately 2-12 years later. The p17 gag and env V3 regions of the HIV-1 genome were directly sequenced from uncultured lymphocytes. A true phylogenetic tree was constructed based on the knowledge about when the transmissions had occurred and when the samples were obtained. This complex, known HIV-1 transmission history was compared with reconstructed molecular trees, which were calculated from the DNA sequences by several commonly used phylogenetic inference methods [Fitch-Margoliash, neighbor-joining, minimum-evolution, maximum-likelihood, maximum-parsimony, unweighted pair group method using arithmetic averages (UPGMA), and a Fitch-Margoliash method assuming a molecular clock (KITSCH)]. A majority of the reconstructed trees were good estimates of the true phylogeny; 12 of 13 taxa were correctly positioned in the most accurate trees. The choice of gene fragment was found to be more important than the choice of phylogenetic method and substitution model. However, methods that are sensitive to unequal rates of change performed more poorly (such as UPGMA and KITSCH, which assume a constant molecular clock). The rapidly evolving V3 fragment gave better reconstructions than p17, but a combined data set of both p17 and V3 performed best. 
The accuracy of the phylogenetic methods justifies their use in HIV-1 research and argues against convergent evolution and selective transmission of certain virus variants.
Abstract:
The genes for the protein synthesis elongation factors Tu (EF-Tu) and G (EF-G) are the products of an ancient gene duplication, which appears to predate the divergence of all extant organismal lineages. Thus, it should be possible to root a universal phylogeny based on either protein using the second protein as an outgroup. This approach was originally taken independently with two separate gene duplication pairs, (i) the regulatory and catalytic subunits of the proton ATPases and (ii) the protein synthesis elongation factors EF-Tu and EF-G. Questions about the orthology of the ATPase genes have obscured the former results, and the elongation factor data have been criticized for inadequate taxonomic representation and alignment errors. We have expanded the latter analysis using a broad representation of taxa from all three domains of life. All phylogenetic methods used strongly place the root of the universal tree between two highly distinct groups, the archaeons/eukaryotes and the eubacteria. We also find that a combined data set of EF-Tu and EF-G sequences favors placement of the eukaryotes within the Archaea, as the sister group to the Crenarchaeota. This relationship is supported by bootstrap values of 60-89% with various distance and maximum likelihood methods, while unweighted parsimony gives 58% support for archaeal monophyly.
Abstract:
The origin of land vertebrates was one of the major transitions in the history of vertebrates. Yet, despite many studies that are based on either morphology or molecules, the phylogenetic relationships among tetrapods and the other two living groups of lobe-finned fishes, the coelacanth and the lungfishes, are still unresolved and debated. Knowledge of the relationships among these lineages, which originated back in the Devonian, has profound implications for the reconstruction of the evolutionary scenario of the conquest of land. We collected the largest molecular data set on this issue so far, about 3,500 base pairs from seven species of the large 28S nuclear ribosomal gene. All phylogenetic analyses (maximum parsimony, neighbor-joining, and maximum likelihood) point toward the hypothesis that lungfishes and coelacanths form a monophyletic group and are equally closely related to land vertebrates. This evolutionary hypothesis complicates the identification of morphological or physiological preadaptations that might have permitted the common ancestor of tetrapods to colonize land. This is because the reconstruction of its ancestral conditions would be hindered by the difficulty to separate uniquely derived characters from shared derived characters in the coelacanth/lungfish and tetrapod lineages. This molecular phylogeny aids in the reconstruction of morphological evolutionary steps by providing a framework; however, only paleontological evidence can determine the sequence of morphological acquisitions that allowed lobe-finned fishes to colonize land.
Abstract:
In this work we propose the use of a Bayesian method to estimate the memory parameter of a long-memory stochastic process when its likelihood function is intractable or unavailable. This approach provides an approximation to the posterior distribution over the memory and other parameters and is based on a simple application of the method known as approximate Bayesian computation (ABC). Some popular estimators of the memory parameter are reviewed and compared with this approach. Our proposal makes it feasible to solve complex problems from a Bayesian point of view and, although approximate, performs very satisfactorily when compared with classical methods.
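To make the ABC idea in this abstract concrete, here is a minimal rejection-style sketch for a memory parameter d. Everything specific in it is an assumption rather than the authors' setup: the ARFIMA(0, d, 0) model simulated by a truncated moving-average expansion, the uniform prior on d, the low-lag autocorrelations used as summary statistics, and the rule of keeping the closest draws.

```python
import random


def simulate_arfima(d, n, rng, trunc=50):
    """Simulate ARFIMA(0, d, 0) via a truncated MA(infinity) expansion."""
    psi = [1.0]
    for j in range(1, trunc):
        psi.append(psi[-1] * (j - 1 + d) / j)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n + trunc)]
    return [
        sum(psi[j] * noise[t + trunc - j] for j in range(trunc))
        for t in range(n)
    ]


def autocorr(x, lag):
    """Sample autocorrelation at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


def summary(x, lags=(1, 2, 3)):
    # Summary statistics: low-lag autocorrelations (an assumption).
    return [autocorr(x, h) for h in lags]


def abc_rejection(observed, n_draws=200, n_keep=20, seed=0):
    """Keep the n_keep prior draws whose simulated summaries are closest."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    scored = []
    for _ in range(n_draws):
        d = rng.uniform(0.0, 0.49)  # assumed prior on the memory parameter
        s_sim = summary(simulate_arfima(d, len(observed), rng))
        dist = sum((a - b) ** 2 for a, b in zip(s_obs, s_sim)) ** 0.5
        scored.append((dist, d))
    scored.sort()
    return [d for _, d in scored[:n_keep]]


# Synthetic "observed" series with a known memory parameter (toy data).
data = simulate_arfima(0.3, 150, random.Random(42))
posterior_sample = abc_rejection(data)
posterior_mean = sum(posterior_sample) / len(posterior_sample)
```

The retained draws approximate the posterior without ever evaluating the likelihood; only the ability to simulate from the model is needed.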
Abstract:
Many applications, including object reconstruction, robot guidance, and scene mapping, require the registration of multiple views of a scene to generate a complete geometric and appearance model of it. In real situations, transformations between views are unknown and it is necessary to apply expert inference to estimate them. In the last few years, the emergence of low-cost depth-sensing cameras has strengthened the research on this topic, motivating a plethora of new applications. Although they have enough resolution and accuracy for many applications, some situations may not be solved with general state-of-the-art registration methods due to the signal-to-noise ratio (SNR) and the resolution of the data provided. The problem of working with low SNR data may, in general terms, appear in any 3D system, so novel solutions are needed in this respect. In this paper, we propose a method, μ-MAR, that performs both coarse and fine registration of sets of 3D points provided by low-cost depth-sensing cameras (although it is not restricted to these sensors) into a common coordinate system. The method overcomes the noisy data problem by using a model-based solution of multiplane registration. Specifically, it iteratively registers 3D markers composed of multiple planes extracted from points of multiple views of the scene. As the markers and the object of interest are static in the scenario, the transformations obtained for the markers are applied to the object in order to reconstruct it. Experiments have been performed using synthetic and real data. The synthetic data allow a qualitative and quantitative evaluation by means of visual inspection and Hausdorff distance, respectively. The real data experiments show the performance of the proposal using data acquired by a Primesense Carmine RGB-D sensor. The method has been compared to several state-of-the-art methods. 
The results show that μ-MAR registers objects with high accuracy in the presence of noisy data, outperforming the existing methods.
Abstract:
We consider the problem of assessing the number of clusters in a limited number of tissue samples containing gene expressions for possibly several thousands of genes. It is proposed to use a normal mixture model-based approach to the clustering of the tissue samples. One advantage of this approach is that the question on the number of clusters in the data can be formulated in terms of a test on the smallest number of components in the mixture model compatible with the data. This test can be carried out on the basis of the likelihood ratio test statistic, using resampling to assess its null distribution. The effectiveness of this approach is demonstrated on simulated data and on some microarray datasets, as considered previously in the bioinformatics literature. (C) 2004 Elsevier Inc. All rights reserved.
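The resampling step described in this abstract can be sketched as a parametric bootstrap of the likelihood ratio statistic for one versus two mixture components. This is a univariate toy version under stated assumptions: the EM initialisation, the variance floor, the number of bootstrap replicates, and the synthetic data are all hypothetical choices, and the abstract's setting (tissue samples with thousands of genes) is multivariate.

```python
import math
import random


def normal_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)


def fit_one(x):
    """Single normal fit: returns (log-likelihood, mean, variance)."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    ll = sum(-0.5 * (math.log(2 * math.pi * var) + (v - mu) ** 2 / var) for v in x)
    return ll, mu, var


def fit_two(x, iters=50, var_floor=1e-3):
    """EM for a two-component normal mixture; returns the log-likelihood."""
    xs = sorted(x)
    n, half = len(xs), len(xs) // 2
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]  # median split
    pooled = fit_one(x)[2]
    var, w = [pooled, pooled], 0.5
    for _ in range(iters):
        # E-step: responsibility of component 0 for each point.
        resp = []
        for v in x:
            p0 = w * normal_pdf(v, mu[0], var[0])
            p1 = (1.0 - w) * normal_pdf(v, mu[1], var[1])
            resp.append(p0 / (p0 + p1 + 1e-300))
        # M-step, with guards against empty components.
        n0 = max(sum(resp), 1e-8)
        n1 = max(n - sum(resp), 1e-8)
        w = min(max(n0 / n, 1e-8), 1.0 - 1e-8)
        mu[0] = sum(r * v for r, v in zip(resp, x)) / n0
        mu[1] = sum((1 - r) * v for r, v in zip(resp, x)) / n1
        var[0] = max(sum(r * (v - mu[0]) ** 2 for r, v in zip(resp, x)) / n0, var_floor)
        var[1] = max(sum((1 - r) * (v - mu[1]) ** 2 for r, v in zip(resp, x)) / n1, var_floor)
    return sum(
        math.log(w * normal_pdf(v, mu[0], var[0])
                 + (1 - w) * normal_pdf(v, mu[1], var[1]) + 1e-300)
        for v in x
    )


def lrt_stat(x):
    # -2 log(lambda), clamped at zero in case EM stops below the null fit.
    return max(0.0, 2.0 * (fit_two(x) - fit_one(x)[0]))


def bootstrap_pvalue(x, n_boot=19, seed=0):
    """Simulate from the fitted null (one component) to assess the statistic."""
    rng = random.Random(seed)
    _, mu, var = fit_one(x)
    observed = lrt_stat(x)
    sd = math.sqrt(var)
    exceed = sum(
        lrt_stat([rng.gauss(mu, sd) for _ in x]) >= observed
        for _ in range(n_boot)
    )
    return observed, (exceed + 1) / (n_boot + 1)


# Toy data: two well-separated groups of 30 points each.
gen = random.Random(1)
sample = [gen.gauss(0.0, 1.0) for _ in range(30)] + [gen.gauss(6.0, 1.0) for _ in range(30)]
stat, p_value = bootstrap_pvalue(sample)
```

Resampling is used because the usual chi-squared reference distribution does not hold for this test; repeating it with two versus three components, and so on, gives a sequential assessment of the smallest compatible number of components.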
Abstract:
Mixture models implemented via the expectation-maximization (EM) algorithm are being increasingly used in a wide range of problems in pattern recognition such as image segmentation. However, the EM algorithm requires considerable computational time in its application to huge data sets such as a three-dimensional magnetic resonance (MR) image of over 10 million voxels. Recently, it was shown that a sparse, incremental version of the EM algorithm could improve its rate of convergence. In this paper, we show how this modified EM algorithm can be speeded up further by adopting a multiresolution kd-tree structure in performing the E-step. The proposed algorithm outperforms some other variants of the EM algorithm for segmenting MR images of the human brain. (C) 2004 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
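A minimal sketch of the incremental idea mentioned in this abstract: the data are split into blocks, the E-step is refreshed for one block at a time, and the M-step runs after every block using globally accumulated sufficient statistics. It is a univariate toy under assumptions (block count, initial values, variance floor, synthetic data) and deliberately leaves out the sparse updates and the multiresolution kd-tree that the paper describes.

```python
import math
import random


def normal_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)


def block_stats(block, weights, mus, variances):
    """Partial E-step: expected sufficient statistics for one block."""
    g = len(weights)
    # Per component: sum of r, sum of r*x, sum of r*x^2.
    stats = [[0.0, 0.0, 0.0] for _ in range(g)]
    for x in block:
        dens = [weights[k] * normal_pdf(x, mus[k], variances[k]) for k in range(g)]
        total = sum(dens) + 1e-300
        for k in range(g):
            r = dens[k] / total
            stats[k][0] += r
            stats[k][1] += r * x
            stats[k][2] += r * x * x
    return stats


def incremental_em(data, init_mus, n_blocks=5, sweeps=20, var_floor=1e-3):
    mus = list(init_mus)
    g, n = len(mus), len(data)
    weights = [1.0 / g] * g
    variances = [1.0] * g
    blocks = [data[i::n_blocks] for i in range(n_blocks)]
    # One full E-step to initialise the per-block statistics.
    per_block = [block_stats(b, weights, mus, variances) for b in blocks]
    total = [[sum(pb[k][j] for pb in per_block) for j in range(3)] for k in range(g)]
    for _ in range(sweeps):
        for i, block in enumerate(blocks):
            # Refresh only this block's contribution to the global statistics.
            new = block_stats(block, weights, mus, variances)
            for k in range(g):
                for j in range(3):
                    total[k][j] += new[k][j] - per_block[i][k][j]
            per_block[i] = new
            # M-step from the global statistics after every block.
            for k in range(g):
                nk = max(total[k][0], 1e-8)
                weights[k] = nk / n
                mus[k] = total[k][1] / nk
                variances[k] = max(total[k][2] / nk - mus[k] ** 2, var_floor)
    return weights, mus, variances


# Toy data: two groups centred at 0 and 5.
gen = random.Random(7)
points = [gen.gauss(0.0, 1.0) for _ in range(100)] + [gen.gauss(5.0, 1.0) for _ in range(100)]
fitted_weights, fitted_mus, fitted_vars = incremental_em(points, [min(points), max(points)])
```

Because parameters are updated after each block rather than after a full pass, the estimates can move earlier in each sweep, which is the source of the faster convergence the abstract refers to.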
Abstract:
Objective: Inpatient length of stay (LOS) is an important measure of hospital activity, health care resource consumption, and patient acuity. This research work aims at developing an incremental expectation maximization (EM) based learning approach for a mixture of experts (ME) system for on-line prediction of LOS. The use of a batch-mode learning process in most existing artificial neural networks to predict LOS is unrealistic, as the data become available over time and their patterns change dynamically. In contrast, an on-line process is capable of providing an output whenever a new datum becomes available. This on-the-spot information is therefore more useful and practical for making decisions, especially when one deals with a tremendous amount of data. Methods and material: The proposed approach is illustrated using a real example of gastroenteritis LOS data. The data set was extracted from a retrospective cohort study on all infants born in 1995-1997 and their subsequent admissions for gastroenteritis. The total number of admissions in this data set was n = 692. Linked hospitalization records of the cohort were retrieved retrospectively to derive the outcome measure, patient demographics, and associated co-morbidities information. A comparative study of the incremental learning and the batch-mode learning algorithms is considered. The performances of the learning algorithms are compared based on the mean absolute difference (MAD) between the predictions and the actual LOS, and the proportion of predictions with MAD < 1 day (Prop(MAD < 1)). The significance of the comparison is assessed through a regression analysis. Results: The incremental learning algorithm provides better on-line prediction of LOS when the system has gained sufficient training from more examples (MAD = 1.77 days and Prop(MAD < 1) = 54.3%), compared to that using the batch-mode learning. 
The regression analysis indicates a significant decrease of MAD (p-value = 0.063) and a significant (p-value = 0.044) increase of Prop(MAD