941 resultados para Conditional Heteroskedasticity
Resumo:
utomatic pain monitoring has the potential to greatly improve patient diagnosis and outcomes by providing a continuous objective measure. One of the most promising methods is to do this via automatically detecting facial expressions. However, current approaches have failed due to their inability to: 1) integrate the rigid and non-rigid head motion into a single feature representation, and 2) incorporate the salient temporal patterns into the classification stage. In this paper, we tackle the first problem by developing a “histogram of facial action units” representation using Active Appearance Model (AAM) face features, and then utilize a Hidden Conditional Random Field (HCRF) to overcome the second issue. We show that both of these methods improve the performance on the task of pain detection in sequence level compared to current state-of-the-art-methods on the UNBC-McMaster Shoulder Pain Archive.
Resumo:
The Australian e-Health Research Centre (AEHRC) recently participated in the ShARe/CLEF eHealth Evaluation Lab Task 1. The goal of this task is to individuate mentions of disorders in free-text electronic health records and map disorders to SNOMED CT concepts in the UMLS metathesaurus. This paper details our participation to this ShARe/CLEF task. Our approaches are based on using the clinical natural language processing tool Metamap and Conditional Random Fields (CRF) to individuate mentions of disorders and then to map those to SNOMED CT concepts. Empirical results obtained on the 2013 ShARe/CLEF task highlight that our instance of Metamap (after ltering irrelevant semantic types), although achieving a high level of precision, is only able to identify a small amount of disorders (about 21% to 28%) from free-text health records. On the other hand, the addition of the CRF models allows for a much higher recall (57% to 79%) of disorders from free-text, without sensible detriment in precision. When evaluating the accuracy of the mapping of disorders to SNOMED CT concepts in the UMLS, we observe that the mapping obtained by our ltered instance of Metamap delivers state-of-the-art e ectiveness if only spans individuated by our system are considered (`relaxed' accuracy).
Resumo:
Using American panel data from the National Education Longitudinal Study of 1988, this article investigates the effect of working during grade 12 on attainment.We employ, for the first time in the related literature, a semiparametric propensity score matching approach combined with difference-in-differences. We address selection on both observables and unobservables associated with part-time work decisions, without the need for instrumental variable. Once such factors are controlled for, little to no effects on reading and math scores are found. Overall, our results therefore suggest a negligible academic cost from part-time working by the end of high school.
Resumo:
A spatial process observed over a lattice or a set of irregular regions is usually modeled using a conditionally autoregressive (CAR) model. The neighborhoods within a CAR model are generally formed deterministically using the inter-distances or boundaries between the regions. An extension of CAR model is proposed in this article where the selection of the neighborhood depends on unknown parameter(s). This extension is called a Stochastic Neighborhood CAR (SNCAR) model. The resulting model shows flexibility in accurately estimating covariance structures for data generated from a variety of spatial covariance models. Specific examples are illustrated using data generated from some common spatial covariance functions as well as real data concerning radioactive contamination of the soil in Switzerland after the Chernobyl accident.
Resumo:
Modular arithmetic has often been regarded as something of a mathematical curiosity, at least by those unfamiliar with its importance to both abstract algebra and number theory, and with its numerous applications. However, with the ubiquity of fast digital computers, and the need for reliable digital security systems such as RSA, this important branch of mathematics is now considered essential knowledge for many professionals. Indeed, computer arithmetic itself is, ipso facto, modular. This chapter describes how the modern graphical spreadsheet may be used to clearly illustrate the basics of modular arithmetic, and to solve certain classes of problems. Students may then gain structural insight and the foundations laid for applications to such areas as hashing, random number generation, and public-key cryptography.
Resumo:
In this paper we propose a new multivariate GARCH model with time-varying conditional correlation structure. The time-varying conditional correlations change smoothly between two extreme states of constant correlations according to a predetermined or exogenous transition variable. An LM–test is derived to test the constancy of correlations and LM- and Wald tests to test the hypothesis of partially constant correlations. Analytical expressions for the test statistics and the required derivatives are provided to make computations feasible. An empirical example based on daily return series of five frequently traded stocks in the S&P 500 stock index completes the paper.
Resumo:
Active learning approaches reduce the annotation cost required by traditional supervised approaches to reach the same effectiveness by actively selecting informative instances during the learning phase. However, effectiveness and robustness of the learnt models are influenced by a number of factors. In this paper we investigate the factors that affect the effectiveness, more specifically in terms of stability and robustness, of active learning models built using conditional random fields (CRFs) for information extraction applications. Stability, defined as a small variation of performance when small variation of the training data or a small variation of the parameters occur, is a major issue for machine learning models, but even more so in the active learning framework which aims to minimise the amount of training data required. The factors we investigate are a) the choice of incremental vs. standard active learning, b) the feature set used as a representation of the text (i.e., morphological features, syntactic features, or semantic features) and c) Gaussian prior variance as one of the important CRFs parameters. Our empirical findings show that incremental learning and the Gaussian prior variance lead to more stable and robust models across iterations. Our study also demonstrates that orthographical, morphological and contextual features as a group of basic features play an important role in learning effective models across all iterations.
Resumo:
This paper presents an efficient noniterative method for distribution state estimation using conditional multivariate complex Gaussian distribution (CMCGD). In the proposed method, the mean and standard deviation (SD) of the state variables is obtained in one step considering load uncertainties, measurement errors, and load correlations. In this method, first the bus voltages, branch currents, and injection currents are represented by MCGD using direct load flow and a linear transformation. Then, the mean and SD of bus voltages, or other states, are calculated using CMCGD and estimation of variance method. The mean and SD of pseudo measurements, as well as spatial correlations between pseudo measurements, are modeled based on the historical data for different levels of load duration curve. The proposed method can handle load uncertainties without using time-consuming approaches such as Monte Carlo. Simulation results of two case studies, six-bus, and a realistic 747-bus distribution network show the effectiveness of the proposed method in terms of speed, accuracy, and quality against the conventional approach.
Resumo:
Product reviews are the foremost source of information for customers and manufacturers to help them make appropriate purchasing and production decisions. Natural language data is typically very sparse; the most common words are those that do not carry a lot of semantic content, and occurrences of any particular content-bearing word are rare, while co-occurrences of these words are rarer. Mining product aspects, along with corresponding opinions, is essential for Aspect-Based Opinion Mining (ABOM) as a result of the e-commerce revolution. Therefore, the need for automatic mining of reviews has reached a peak. In this work, we deal with ABOM as sequence labelling problem and propose a supervised extraction method to identify product aspects and corresponding opinions. We use Conditional Random Fields (CRFs) to solve the extraction problem and propose a feature function to enhance accuracy. The proposed method is evaluated using two different datasets. We also evaluate the effectiveness of feature function and the optimisation through multiple experiments.
Resumo:
Nephrin is a transmembrane protein belonging to the immunoglobulin superfamily and is expressed primarily in the podocytes, which are highly differentiated epithelial cells needed for primary urine formation in the kidney. Mutations leading to nephrin loss abrogate podocyte morphology, and result in massive protein loss into urine and consequent early death in humans carrying specific mutations in this gene. The disease phenotype is closely replicated in respective mouse models. The purpose of this thesis was to generate novel inducible mouse-lines, which allow targeted gene deletion in a time and tissue-specific manner. A proof of principle model for succesful gene therapy for this disease was generated, which allowed podocyte specific transgene replacement to rescue gene deficient mice from perinatal lethality. Furthermore, the phenotypic consequences of nephrin restoration in the kidney and nephrin deficiency in the testis, brain and pancreas in rescued mice were investigated. A novel podocyte-specific construct was achieved by using standard cloning techniques to provide an inducible tool for in vitro and in vivo gene targeting. Using modified constructs and microinjection procedures two novel transgenic mouse-lines were generated. First, a mouse-line with doxycycline inducible expression of Cre recombinase that allows podocyte-specific gene deletion was generated. Second, a mouse-line with doxycycline inducible expression of rat nephrin, which allows podocyte-specific nephrin over-expression was made. Furthermore, it was possible to rescue nephrin deficient mice from perinatal lethality by cross-breeding them with a mouse-line with inducible rat nephrin expression that restored the missing endogenous nephrin only in the kidney after doxycycline treatment. The rescued mice were smaller, infertile, showed genital malformations and developed distinct histological abnormalities in the kidney with an altered molecular composition of the podocytes. Histological changes were also found in the testis, cerebellum and pancreas. The expression of another molecule with limited tissue expression, densin, was localized to the plasma membranes of Sertoli cells in the testis by immunofluorescence staining. Densin may be an essential adherens junction protein between Sertoli cells and developing germ cells and these junctions share similar protein assembly with kidney podocytes. This single, binary conditional construct serves as a cost- and time-efficient tool to increase the understanding of podocyte-specific key proteins in health and disease. The results verified a tightly controlled inducible podocyte-specific transgene expression in vitro and in vivo as expected. These novel mouse-lines with doxycycline inducible Cre recombinase and with rat nephrin expression will be useful for conditional gene targeting of essential podocyte proteins and to study in detail their functions in the adult mice. This is important for future diagnostic and pharmacologic development platforms.
Resumo:
Downscaling to station-scale hydrologic variables from large-scale atmospheric variables simulated by general circulation models (GCMs) is usually necessary to assess the hydrologic impact of climate change. This work presents CRF-downscaling, a new probabilistic downscaling method that represents the daily precipitation sequence as a conditional random field (CRF). The conditional distribution of the precipitation sequence at a site, given the daily atmospheric (large-scale) variable sequence, is modeled as a linear chain CRF. CRFs do not make assumptions on independence of observations, which gives them flexibility in using high-dimensional feature vectors. Maximum likelihood parameter estimation for the model is performed using limited memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) optimization. Maximum a posteriori estimation is used to determine the most likely precipitation sequence for a given set of atmospheric input variables using the Viterbi algorithm. Direct classification of dry/wet days as well as precipitation amount is achieved within a single modeling framework. The model is used to project the future cumulative distribution function of precipitation. Uncertainty in precipitation prediction is addressed through a modified Viterbi algorithm that predicts the n most likely sequences. The model is applied for downscaling monsoon (June-September) daily precipitation at eight sites in the Mahanadi basin in Orissa, India, using the MIROC3.2 medium-resolution GCM. The predicted distributions at all sites show an increase in the number of wet days, and also an increase in wet day precipitation amounts. A comparison of current and future predicted probability density functions for daily precipitation shows a change in shape of the density function with decreasing probability of lower precipitation and increasing probability of higher precipitation.
Resumo:
We consider the problem of detecting statistically significant sequential patterns in multineuronal spike trains. These patterns are characterized by ordered sequences of spikes from different neurons with specific delays between spikes. We have previously proposed a data-mining scheme to efficiently discover such patterns, which occur often enough in the data. Here we propose a method to determine the statistical significance of such repeating patterns. The novelty of our approach is that we use a compound null hypothesis that not only includes models of independent neurons but also models where neurons have weak dependencies. The strength of interaction among the neurons is represented in terms of certain pair-wise conditional probabilities. We specify our null hypothesis by putting an upper bound on all such conditional probabilities. We construct a probabilistic model that captures the counting process and use this to derive a test of significance for rejecting such a compound null hypothesis. The structure of our null hypothesis also allows us to rank-order different significant patterns. We illustrate the effectiveness of our approach using spike trains generated with a simulator.
Resumo:
Deep convolutional neural networks (DCNNs) have been employed in many computer vision tasks with great success due to their robustness in feature learning. One of the advantages of DCNNs is their representation robustness to object locations, which is useful for object recognition tasks. However, this also discards spatial information, which is useful when dealing with topological information of the image (e.g. scene labeling, face recognition). In this paper, we propose a deeper and wider network architecture to tackle the scene labeling task. The depth is achieved by incorporating predictions from multiple early layers of the DCNN. The width is achieved by combining multiple outputs of the network. We then further refine the parsing task by adopting graphical models (GMs) as a post-processing step to incorporate spatial and contextual information into the network. The new strategy for a deeper, wider convolutional network coupled with graphical models has shown promising results on the PASCAL-Context dataset.
Resumo:
A better performing product code vector quantization (VQ) method is proposed for coding the line spectrum frequency (LSF) parameters; the method is referred to as sequential split vector quantization (SeSVQ). The split sub-vectors of the full LSF vector are quantized in sequence and thus uses conditional distribution derived from the previous quantized sub-vectors. Unlike the traditional split vector quantization (SVQ) method, SeSVQ exploits the inter sub-vector correlation and thus provides improved rate-distortion performance, but at the expense of higher memory. We investigate the quantization performance of SeSVQ over traditional SVQ and transform domain split VQ (TrSVQ) methods. Compared to SVQ, SeSVQ saves 1 bit and nearly 3 bits, for telephone-band and wide-band speech coding applications respectively.