191 resultados para RANDOM LATTICE FIELDS

em Queensland University of Technology - ePrints Archive


Relevância:

40.00% 40.00%

Publicador:

Resumo:

Log-linear and maximum-margin models are two commonly-used methods in supervised machine learning, and are frequently used in structured prediction problems. Efficient learning of parameters in these models is therefore an important problem, and becomes a key factor when learning from very large data sets. This paper describes exponentiated gradient (EG) algorithms for training such models, where EG updates are applied to the convex dual of either the log-linear or max-margin objective function; the dual in both the log-linear and max-margin cases corresponds to minimizing a convex function with simplex constraints. We study both batch and online variants of the algorithm, and provide rates of convergence for both cases. In the max-margin case, O(1/ε) EG updates are required to reach a given accuracy ε in the dual; in contrast, for log-linear models only O(log(1/ε)) updates are required. For both the max-margin and log-linear cases, our bounds suggest that the online EG algorithm requires a factor of n less computation to reach a desired accuracy than the batch EG algorithm, where n is the number of training examples. Our experiments confirm that the online algorithms are much faster than the batch algorithms in practice. We describe how the EG updates factor in a convenient way for structured prediction problems, allowing the algorithms to be efficiently applied to problems such as sequence learning or natural language parsing. We perform extensive evaluation of the algorithms, comparing them to L-BFGS and stochastic gradient descent for log-linear models, and to SVM-Struct for max-margin models. The algorithms are applied to a multi-class problem as well as to a more complex large-scale parsing task. In all these settings, the EG algorithms presented here outperform the other methods.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The Australian e-Health Research Centre (AEHRC) recently participated in the ShARe/CLEF eHealth Evaluation Lab Task 1. The goal of this task is to individuate mentions of disorders in free-text electronic health records and map disorders to SNOMED CT concepts in the UMLS metathesaurus. This paper details our participation to this ShARe/CLEF task. Our approaches are based on using the clinical natural language processing tool Metamap and Conditional Random Fields (CRF) to individuate mentions of disorders and then to map those to SNOMED CT concepts. Empirical results obtained on the 2013 ShARe/CLEF task highlight that our instance of Metamap (after ltering irrelevant semantic types), although achieving a high level of precision, is only able to identify a small amount of disorders (about 21% to 28%) from free-text health records. On the other hand, the addition of the CRF models allows for a much higher recall (57% to 79%) of disorders from free-text, without sensible detriment in precision. When evaluating the accuracy of the mapping of disorders to SNOMED CT concepts in the UMLS, we observe that the mapping obtained by our ltered instance of Metamap delivers state-of-the-art e ectiveness if only spans individuated by our system are considered (`relaxed' accuracy).

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Active learning approaches reduce the annotation cost required by traditional supervised approaches to reach the same effectiveness by actively selecting informative instances during the learning phase. However, effectiveness and robustness of the learnt models are influenced by a number of factors. In this paper we investigate the factors that affect the effectiveness, more specifically in terms of stability and robustness, of active learning models built using conditional random fields (CRFs) for information extraction applications. Stability, defined as a small variation of performance when small variation of the training data or a small variation of the parameters occur, is a major issue for machine learning models, but even more so in the active learning framework which aims to minimise the amount of training data required. The factors we investigate are a) the choice of incremental vs. standard active learning, b) the feature set used as a representation of the text (i.e., morphological features, syntactic features, or semantic features) and c) Gaussian prior variance as one of the important CRFs parameters. Our empirical findings show that incremental learning and the Gaussian prior variance lead to more stable and robust models across iterations. Our study also demonstrates that orthographical, morphological and contextual features as a group of basic features play an important role in learning effective models across all iterations.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Product reviews are the foremost source of information for customers and manufacturers to help them make appropriate purchasing and production decisions. Natural language data is typically very sparse; the most common words are those that do not carry a lot of semantic content, and occurrences of any particular content-bearing word are rare, while co-occurrences of these words are rarer. Mining product aspects, along with corresponding opinions, is essential for Aspect-Based Opinion Mining (ABOM) as a result of the e-commerce revolution. Therefore, the need for automatic mining of reviews has reached a peak. In this work, we deal with ABOM as sequence labelling problem and propose a supervised extraction method to identify product aspects and corresponding opinions. We use Conditional Random Fields (CRFs) to solve the extraction problem and propose a feature function to enhance accuracy. The proposed method is evaluated using two different datasets. We also evaluate the effectiveness of feature function and the optimisation through multiple experiments.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Random walk models are often used to interpret experimental observations of the motion of biological cells and molecules. A key aim in applying a random walk model to mimic an in vitro experiment is to estimate the Fickian diffusivity (or Fickian diffusion coefficient),D. However, many in vivo experiments are complicated by the fact that the motion of cells and molecules is hindered by the presence of obstacles. Crowded transport processes have been modeled using repeated stochastic simulations in which a motile agent undergoes a random walk on a lattice that is populated by immobile obstacles. Early studies considered the most straightforward case in which the motile agent and the obstacles are the same size. More recent studies considered stochastic random walk simulations describing the motion of an agent through an environment populated by obstacles of different shapes and sizes. Here, we build on previous simulation studies by analyzing a general class of lattice-based random walk models with agents and obstacles of various shapes and sizes. Our analysis provides exact calculations of the Fickian diffusivity, allowing us to draw conclusions about the role of the size, shape and density of the obstacles, as well as examining the role of the size and shape of the motile agent. Since our analysis is exact, we calculateDdirectly without the need for random walk simulations. In summary, we find that the shape, size and density of obstacles has a major influence on the exact Fickian diffusivity. Furthermore, our results indicate that the difference in diffusivity for symmetric and asymmetric obstacles is significant.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Deep convolutional neural networks (DCNNs) have been employed in many computer vision tasks with great success due to their robustness in feature learning. One of the advantages of DCNNs is their representation robustness to object locations, which is useful for object recognition tasks. However, this also discards spatial information, which is useful when dealing with topological information of the image (e.g. scene labeling, face recognition). In this paper, we propose a deeper and wider network architecture to tackle the scene labeling task. The depth is achieved by incorporating predictions from multiple early layers of the DCNN. The width is achieved by combining multiple outputs of the network. We then further refine the parsing task by adopting graphical models (GMs) as a post-processing step to incorporate spatial and contextual information into the network. The new strategy for a deeper, wider convolutional network coupled with graphical models has shown promising results on the PASCAL-Context dataset.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov Random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers is suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy. This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (N Cs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the Integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter Autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extensions allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Using six kinds of lattice types (4×4 ,5×5 , and6×6 square lattices;3×3×3 cubic lattice; and2+3+4+3+2 and4+5+6+5+4 triangular lattices), three different size alphabets (HP ,HNUP , and 20 letters), and two energy functions, the designability of proteinstructures is calculated based on random samplings of structures and common biased sampling (CBS) of proteinsequence space. Then three quantities stability (average energy gap),foldability, and partnum of the structure, which are defined to elucidate the designability, are calculated. The authors find that whatever the type of lattice, alphabet size, and energy function used, there will be an emergence of highly designable (preferred) structure. For all cases considered, the local interactions reduce degeneracy and make the designability higher. The designability is sensitive to the lattice type, alphabet size, energy function, and sampling method of the sequence space. Compared with the random sampling method, both the CBS and the Metropolis Monte Carlo sampling methods make the designability higher. The correlation coefficients between the designability, stability, and foldability are mostly larger than 0.5, which demonstrate that they have strong correlation relationship. But the correlation relationship between the designability and the partnum is not so strong because the partnum is independent of the energy. The results are useful in practical use of the designability principle, such as to predict the proteintertiary structure.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Individual-based models describing the migration and proliferation of a population of cells frequently restrict the cells to a predefined lattice. An implicit assumption of this type of lattice based model is that a proliferative population will always eventually fill the lattice. Here we develop a new lattice-free individual-based model that incorporates cell-to-cell crowding effects. We also derive approximate mean-field descriptions for the lattice-free model in two special cases motivated by commonly used experimental setups. Lattice-free simulation results are compared to these mean-field descriptions and to a corresponding lattice-based model. Data from a proliferation experiment is used to estimate the parameters for the new model, including the cell proliferation rate, showing that the model fits the data well. An important aspect of the lattice-free model is that the confluent cell density is not predefined, as with lattice-based models, but an emergent model property. As a consequence of the more realistic, irregular configuration of cells in the lattice-free model, the population growth rate is much slower at high cell densities and the population cannot reach the same confluent density as an equivalent lattice-based model.