937 results for Classification models
Abstract:
The goal of this work is to learn a parsimonious and informative representation for high-dimensional time series. Conceptually, this comprises two distinct yet tightly coupled tasks: learning a low-dimensional manifold and modeling the dynamical process. These two tasks have a complementary relationship as the temporal constraints provide valuable neighborhood information for dimensionality reduction and conversely, the low-dimensional space allows dynamics to be learnt efficiently. Solving these two tasks simultaneously allows important information to be exchanged mutually. If nonlinear models are required to capture the rich complexity of time series, then the learning problem becomes harder as the nonlinearities in both tasks are coupled. The proposed solution approximates the nonlinear manifold and dynamics using piecewise linear models. The interactions among the linear models are captured in a graphical model. By exploiting the model structure, efficient inference and learning algorithms are obtained without oversimplifying the model of the underlying dynamical process. Evaluation of the proposed framework with competing approaches is conducted in three sets of experiments: dimensionality reduction and reconstruction using synthetic time series, video synthesis using a dynamic texture database, and human motion synthesis, classification and tracking on a benchmark data set. In all experiments, the proposed approach provides superior performance.
Abstract:
We introduce Active Hidden Models (AHM) that utilize kernel methods traditionally associated with classification. We use AHMs to track deformable objects in video sequences by leveraging kernel projections. We introduce the "subset projection" method which improves the efficiency of our tracking approach by a factor of ten. We successfully tested our method on facial tracking with extreme head movements (including full 180-degree head rotation), facial expressions, and deformable objects. Given a kernel and a set of training observations, we derive unbiased estimates of the accuracy of the AHM tracker. Kernels are generally used in classification methods to make training data linearly separable. We prove that the optimal (minimum variance) tracking kernels are those that make the training observations linearly dependent.
Abstract:
The goal of this work is to learn a parsimonious and informative representation for high-dimensional time series. Conceptually, this comprises two distinct yet tightly coupled tasks: learning a low-dimensional manifold and modeling the dynamical process. These two tasks have a complementary relationship as the temporal constraints provide valuable neighborhood information for dimensionality reduction and conversely, the low-dimensional space allows dynamics to be learnt efficiently. Solving these two tasks simultaneously allows important information to be exchanged mutually. If nonlinear models are required to capture the rich complexity of time series, then the learning problem becomes harder as the nonlinearities in both tasks are coupled. The proposed solution approximates the nonlinear manifold and dynamics using piecewise linear models. The interactions among the linear models are captured in a graphical model. The model structure setup and parameter learning are done using a variational Bayesian approach, which enables automatic Bayesian model structure selection, hence solving the problem of over-fitting. By exploiting the model structure, efficient inference and learning algorithms are obtained without oversimplifying the model of the underlying dynamical process. Evaluation of the proposed framework with competing approaches is conducted in three sets of experiments: dimensionality reduction and reconstruction using synthetic time series, video synthesis using a dynamic texture database, and human motion synthesis, classification and tracking on a benchmark data set. In all experiments, the proposed approach provides superior performance.
Abstract:
Air Force Office of Scientific Research (F49620-01-1-0423); National Geospatial-Intelligence Agency (NMA 201-01-1-2016); National Science Foundation (SBE-035437, DEG-0221680); Office of Naval Research (N00014-01-1-0624)
Abstract:
How do humans rapidly recognize a scene? How can neural models capture this biological competence to achieve state-of-the-art scene classification? The ARTSCENE neural system classifies natural scene photographs by using multiple spatial scales to efficiently accumulate evidence for gist and texture. ARTSCENE embodies a coarse-to-fine Texture Size Ranking Principle whereby spatial attention processes multiple scales of scenic information, ranging from global gist to local properties of textures. The model can incrementally learn and predict scene identity by gist information alone and can improve performance through selective attention to scenic textures of progressively smaller size. ARTSCENE discriminates 4 landscape scene categories (coast, forest, mountain and countryside) with up to 91.58% correct on a test set, outperforms alternative models in the literature which use biologically implausible computations, and outperforms component systems that use either gist or texture information alone. Model simulations also show that adjacent textures form higher-order features that are also informative for scene recognition.
Abstract:
A framework for adaptive and non-adaptive statistical compressive sensing is developed, where a statistical model replaces the standard sparsity model of classical compressive sensing. We propose within this framework optimal task-specific sensing protocols specifically and jointly designed for classification and reconstruction. A two-step adaptive sensing paradigm is developed, where online sensing is applied to detect the signal class in the first step, followed by a reconstruction step adapted to the detected class and the observed samples. The approach is based on information theory, here tailored for Gaussian mixture models (GMMs), where an information-theoretic objective relationship between the sensed signals and a representation of the specific task of interest is maximized. Experimental results using synthetic signals, Landsat satellite attributes, and natural images of different sizes and with different noise levels show the improvements achieved using the proposed framework when compared to more standard sensing protocols. The underlying formulation can be applied beyond GMMs, at the price of higher mathematical and computational complexity. © 1991-2012 IEEE.
Abstract:
Learning multiple tasks across heterogeneous domains is a challenging problem since the feature space may not be the same for different tasks. We assume the data in multiple tasks are generated from a latent common domain via sparse domain transforms and propose a latent probit model (LPM) to jointly learn the domain transforms, and the shared probit classifier in the common domain. To learn meaningful task relatedness and avoid over-fitting in classification, we introduce sparsity in the domain transforms matrices, as well as in the common classifier. We derive theoretical bounds for the estimation error of the classifier in terms of the sparsity of domain transforms. An expectation-maximization algorithm is derived for learning the LPM. The effectiveness of the approach is demonstrated on several real datasets.
Abstract:
The tendency for island populations of mammalian taxa to diverge in body size from their mainland counterparts consistently in particular directions is both impressive for its regularity and, especially among rodents, troublesome for its exceptions. However, previous studies have largely ignored mainland body size variation, treating size differences of any magnitude as equally noteworthy. Here, we use distributions of mainland population body sizes to identify island populations as 'extremely' big or small, and we compare traits of extreme populations and their islands with those of island populations more typical in body size. We find that although insular rodents vary in the directions of body size change, 'extreme' populations tend towards gigantism. With classification tree methods, we develop a predictive model, which points to resource limitations as major drivers in the few cases of insular dwarfism. Highly successful in classifying our dataset, our model also successfully predicts change in untested cases.
Abstract:
This paper studies two models of two-stage processing with no-wait in process. The first model is the two-machine flow shop, and the other is the assembly model. For both models we consider the problem of minimizing the makespan, provided that the setup and removal times are separated from the processing times. Each of these scheduling problems is reduced to the Traveling Salesman Problem (TSP). We show that, in general, the assembly problem is NP-hard in the strong sense. On the other hand, the two-machine flow shop problem reduces to the Gilmore-Gomory TSP, and is solvable in polynomial time. The same holds for the assembly problem under some reasonable assumptions. Using these and existing results, we provide a complete complexity classification of the relevant two-stage no-wait scheduling models.
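The reduction of the two-machine no-wait flow shop to a TSP can be illustrated with a small sketch. It rests on simplifying assumptions that are not in the abstract: setup and removal times are ignored, each job is just a pair (a, b) of processing times on machines 1 and 2, and the best sequence is found by brute force rather than by the polynomial-time Gilmore-Gomory procedure. The function names and the sample data are hypothetical. Under these assumptions, the makespan of a sequence equals the length of a tour through the jobs and a dummy job (0, 0), with arc cost max(b_i, a_j) for job i followed by job j.

```python
from itertools import permutations


def simulate_makespan(jobs, order):
    """Directly simulate a two-machine no-wait flow shop (no setup/removal times)."""
    m1_free = 0  # time at which machine 1 becomes free
    m2_free = 0  # time at which machine 2 becomes free
    for idx in order:
        a, b = jobs[idx]
        # Start on machine 1 late enough that machine 2 is free
        # the moment the machine-1 operation ends (no-wait constraint).
        start = max(m1_free, m2_free - a)
        m1_free = start + a
        m2_free = m1_free + b
    return m2_free


def tour_cost(jobs, order):
    """TSP view: arc cost max(b_i, a_j), with a dummy job (0, 0) closing the tour."""
    seq = [(0, 0)] + [jobs[i] for i in order] + [(0, 0)]
    return sum(max(prev[1], nxt[0]) for prev, nxt in zip(seq, seq[1:]))


def best_sequence(jobs):
    """Brute-force search over all job orders (illustration only, exponential time)."""
    return min(permutations(range(len(jobs))), key=lambda o: tour_cost(jobs, o))


# Hypothetical data: (time on machine 1, time on machine 2) per job.
jobs = [(3, 2), (1, 4), (2, 1)]
order = best_sequence(jobs)
print(order, tour_cost(jobs, order))
```

For every ordering of the sample jobs, the simulated makespan and the tour cost agree, which is the equivalence the reduction relies on in this simplified setting.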
Abstract:
The increasing availability of large, detailed digital representations of the Earth’s surface demands the application of objective and quantitative analyses. Given recent advances in the understanding of the mechanisms of formation of linear bedform features from a range of environments, objective measurement of their wavelength, orientation, crest and trough positions, height and asymmetry is highly desirable. These parameters are also of use when determining observation-based parameters for use in many applications such as numerical modelling, surface classification and sediment transport pathway analysis. Here, we (i) adapt and extend extant techniques to provide a suite of semi-automatic tools which calculate crest orientation, wavelength, height, asymmetry direction and asymmetry ratios of bedforms, and then (ii) undertake sensitivity tests on synthetic data, increasingly complex seabeds and a very large-scale (39 000 km²) aeolian dune system. The automated results are compared with traditional, manually derived measurements at each stage. This new approach successfully analyses different types of topographic data (from aeolian and marine environments) from a range of sources, with tens of millions of data points being processed in a semi-automated and objective manner within minutes rather than hours or days. The results from these analyses show there is significant variability in all measurable parameters in what might otherwise be considered uniform bedform fields. For example, the dunes of the Rub’ al Khali on the Arabian peninsula are shown to exhibit deviations in dimensions from global trends. Morphological and dune asymmetry analysis of the Rub’ al Khali suggests parts of the sand sea may be adjusting to a changed wind regime from that during their formation 100 to 10 ka BP.
Abstract:
An alternative models framework was used to test three confirmatory factor analytic models for the Short Leyton Obsessional Inventory-Children's Version (Short LOI-CV) in a general population sample of 517 young adolescent twins (11-16 years). A one-factor model as implicit in current classification systems of Obsessive-Compulsive Disorder (OCD), a two-factor obsessions and compulsions model, and a multidimensional model corresponding to the three proposed subscales of the Short LOI-CV (labelled Obsessions/Incompleteness, Numbers/Luck and Cleanliness) were considered. The three-factor model was the only model to provide an adequate explanation of the data. Twin analyses suggested significant quantitative sex differences in heritability for both the Obsessions/Incompleteness and Numbers/Luck dimensions with these being significantly heritable in males only (heritability of 60% and 65% respectively). The correlation between the additive genetic effects for these two dimensions in males was 0.95 suggesting they largely share the same genetic risk factors.
Abstract:
The diagnosis of myelodysplastic syndrome (MDS) currently relies primarily on the morphologic assessment of the patient's bone marrow and peripheral blood cells. Moreover, prognostic scoring systems rely on observer-dependent assessments of blast percentage and dysplasia. Gene expression profiling could enhance current diagnostic and prognostic systems by providing a set of standardized, objective gene signatures. Within the Microarray Innovations in LEukemia study, a diagnostic classification model was investigated to distinguish the distinct subclasses of pediatric and adult leukemia, as well as MDS. Overall, the accuracy of the diagnostic classification model for subtyping leukemia was approximately 93%, but this was not reflected for the MDS samples giving only approximately 50% accuracy. Discordant samples of MDS were classified either into acute myeloid leukemia (AML) or
Abstract:
Automatic gender classification has many security and commercial applications. Various modalities have been investigated for gender classification, with face-based classification being the most popular. In some real-world scenarios the face may be partially occluded. In these circumstances, a classification based on individual parts of the face, known as local features, must be adopted. We investigate gender classification using lip movements. We show for the first time that important gender-specific information can be obtained from the way in which a person moves their lips during speech. Furthermore, our study indicates that the lip dynamics during speech provide greater gender-discriminative information than lip appearance alone. We also show that the lip dynamics and appearance contain complementary gender information, such that a model which captures both traits gives the highest overall classification result. We use Discrete Cosine Transform based features and Gaussian Mixture Modelling to model lip appearance and dynamics, and employ the XM2VTS database for our experiments. Our experiments show that a model which captures lip dynamics along with appearance can improve gender classification rates by 16-21% compared to models of only lip appearance.
Abstract:
Aims/hypothesis: Diabetic nephropathy is a major diabetic complication, and diabetes is the leading cause of end-stage renal disease (ESRD). Family studies suggest a hereditary component for diabetic nephropathy. However, only a few genes have been associated with diabetic nephropathy or ESRD in diabetic patients. Our aim was to detect novel genetic variants associated with diabetic nephropathy and ESRD. Methods: We exploited a novel algorithm, ‘Bag of Naive Bayes’, whose marker selection strategy is complementary to that of conventional genome-wide association models based on univariate association tests. The analysis was performed on a genome-wide association study of 3,464 patients with type 1 diabetes from the Finnish Diabetic Nephropathy (FinnDiane) Study and subsequently replicated with 4,263 type 1 diabetes patients from the Steno Diabetes Centre, the All Ireland-Warren 3-Genetics of Kidneys in Diabetes UK collection (UK–Republic of Ireland) and the Genetics of Kidneys in Diabetes US Study (GoKinD US). Results: Five genetic loci (WNT4/ZBTB40-rs12137135, RGMA/MCTP2-rs17709344, MAPRE1P2-rs1670754, SEMA6D/SLC24A5-rs12917114 and SIK1-rs2838302) were associated with ESRD in the FinnDiane study. An association between ESRD and rs17709344, tagging the previously identified rs12437854 and located between the RGMA and MCTP2 genes, was replicated in independent case–control cohorts. rs12917114 near SEMA6D was associated with ESRD in the replication cohorts under the genotypic model (p < 0.05), and rs12137135 upstream of WNT4 was associated with ESRD in Steno. Conclusions/interpretation: This study supports the previously identified findings on the RGMA/MCTP2 region and suggests novel susceptibility loci for ESRD. This highlights the importance of applying complementary statistical methods to detect novel genetic variants in diabetic nephropathy and, in general, in complex diseases.
Abstract:
In this study, 137 corn distillers dried grains with solubles (DDGS) samples from a range of different geographical origins (Jilin Province of China, Heilongjiang Province of China, USA and Europe) were collected and analysed. Different near infrared spectrometers combined with different chemometric packages were used in two independent laboratories to investigate the feasibility of classifying the geographical origin of DDGS. Based on the same dataset, one laboratory developed a partial least squares discriminant analysis model and the other laboratory developed an orthogonal partial least squares discriminant analysis model. Results showed that both models could perfectly classify DDGS samples from different geographical origins. These promising results encourage the development of larger-scale efforts to produce datasets which can be used to differentiate the geographical origin of DDGS; such efforts are required to provide higher-level food security measures on a global scale.