364 resultados para Dataset


Relevância:

10.00% 10.00%

Publicador:

Resumo:

Abnormal event detection has attracted a lot of attention in the computer vision research community during recent years due to the increased focus on automated surveillance systems to improve security in public places. Due to the scarcity of training data and the definition of an abnormality being dependent on context, abnormal event detection is generally formulated as a data-driven approach where activities are modeled in an unsupervised fashion during the training phase. In this work, we use a Gaussian mixture model (GMM) to cluster the activities during the training phase, and propose a Gaussian mixture model based Markov random field (GMM-MRF) to estimate the likelihood scores of new videos in the testing phase. Further-more, we propose two new features: optical acceleration, and the histogram of optical flow gradients; to detect the presence of any abnormal objects and speed violations in the scene. We show that our proposed method outperforms other state of the art abnormal event detection algorithms on publicly available UCSD dataset.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Fine-grained leaf classification has concentrated on the use of traditional shape and statistical features to classify ideal images. In this paper we evaluate the effectiveness of traditional hand-crafted features and propose the use of deep convolutional neural network (ConvNet) features. We introduce a range of condition variations to explore the robustness of these features, including: translation, scaling, rotation, shading and occlusion. Evaluations on the Flavia dataset demonstrate that in ideal imaging conditions, combining traditional and ConvNet features yields state-of-theart performance with an average accuracy of 97:3%�0:6% compared to traditional features which obtain an average accuracy of 91:2%�1:6%. Further experiments show that this combined classification approach consistently outperforms the best set of traditional features by an average of 5:7% for all of the evaluated condition variations.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Description of a patient's injuries is recorded in narrative text form by hospital emergency departments. For statistical reporting, this text data needs to be mapped to pre-defined codes. Existing research in this field uses the Naïve Bayes probabilistic method to build classifiers for mapping. In this paper, we focus on providing guidance on the selection of a classification method. We build a number of classifiers belonging to different classification families such as decision tree, probabilistic, neural networks, and instance-based, ensemble-based and kernel-based linear classifiers. An extensive pre-processing is carried out to ensure the quality of data and, in hence, the quality classification outcome. The records with a null entry in injury description are removed. The misspelling correction process is carried out by finding and replacing the misspelt word with a soundlike word. Meaningful phrases have been identified and kept, instead of removing the part of phrase as a stop word. The abbreviations appearing in many forms of entry are manually identified and only one form of abbreviations is used. Clustering is utilised to discriminate between non-frequent and frequent terms. This process reduced the number of text features dramatically from about 28,000 to 5000. The medical narrative text injury dataset, under consideration, is composed of many short documents. The data can be characterized as high-dimensional and sparse, i.e., few features are irrelevant but features are correlated with one another. Therefore, Matrix factorization techniques such as Singular Value Decomposition (SVD) and Non Negative Matrix Factorization (NNMF) have been used to map the processed feature space to a lower-dimensional feature space. Classifiers with these reduced feature space have been built. In experiments, a set of tests are conducted to reflect which classification method is best for the medical text classification. The Non Negative Matrix Factorization with Support Vector Machine method can achieve 93% precision which is higher than all the tested traditional classifiers. We also found that TF/IDF weighting which works well for long text classification is inferior to binary weighting in short document classification. Another finding is that the Top-n terms should be removed in consultation with medical experts, as it affects the classification performance.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper evaluates the performance of different text recognition techniques for a mobile robot in an indoor (university campus) environment. We compared four different methods: our own approach using existing text detection methods (Minimally Stable Extremal Regions detector and Stroke Width Transform) combined with a convolutional neural network, two modes of the open source program Tesseract, and the experimental mobile app Google Goggles. The results show that a convolutional neural network combined with the Stroke Width Transform gives the best performance in correctly matched text on images with single characters whereas Google Goggles gives the best performance on images with multiple words. The dataset used for this work is released as well.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper presents a single pass algorithm for mining discriminative Itemsets in data streams using a novel data structure and the tilted-time window model. Discriminative Itemsets are defined as Itemsets that are frequent in one data stream and their frequency in that stream is much higher than the rest of the streams in the dataset. In order to deal with the data structure size, we propose a pruning process that results in the compact tree structure containing discriminative Itemsets. Empirical analysis shows the sound time and space complexity of the proposed method.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Background The koala, Phascolarctos cinereus, is a biologically unique and evolutionarily distinct Australian arboreal marsupial. The goal of this study was to sequence the transcriptome from several tissues of two geographically separate koalas, and to create the first comprehensive catalog of annotated transcripts for this species, enabling detailed analysis of the unique attributes of this threatened native marsupial, including infection by the koala retrovirus. Results RNA-Seq data was generated from a range of tissues from one male and one female koala and assembled de novo into transcripts using Velvet-Oases. Transcript abundance in each tissue was estimated. Transcripts were searched for likely protein-coding regions and a non-redundant set of 117,563 putative protein sequences was produced. In similarity searches there were 84,907 (72%) sequences that aligned to at least one sequence in the NCBI nr protein database. The best alignments were to sequences from other marsupials. After applying a reciprocal best hit requirement of koala sequences to those from tammar wallaby, Tasmanian devil and the gray short-tailed opossum, we estimate that our transcriptome dataset represents approximately 15,000 koala genes. The marsupial alignment information was used to look for potential gene duplications and we report evidence for copy number expansion of the alpha amylase gene, and of an aldehyde reductase gene. Koala retrovirus (KoRV) transcripts were detected in the transcriptomes. These were analysed in detail and the structure of the spliced envelope gene transcript was determined. There was appreciable sequence diversity within KoRV, with 233 sites in the KoRV genome showing small insertions/deletions or single nucleotide polymorphisms. Both koalas had sequences from the KoRV-A subtype, but the male koala transcriptome has, in addition, sequences more closely related to the KoRV-B subtype. This is the first report of a KoRV-B-like sequence in a wild population. Conclusions This transcriptomic dataset is a useful resource for molecular genetic studies of the koala, for evolutionary genetic studies of marsupials, for validation and annotation of the koala genome sequence, and for investigation of koala retrovirus. Annotated transcripts can be browsed and queried at http://koalagenome.org

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper we propose the hybrid use of illuminant invariant and RGB images to perform image classification of urban scenes despite challenging variation in lighting conditions. Coping with lighting change (and the shadows thereby invoked) is a non-negotiable requirement for long term autonomy using vision. One aspect of this is the ability to reliably classify scene components in the presence of marked and often sudden changes in lighting. This is the focus of this paper. Posed with the task of classifying all parts in a scene from a full colour image, we propose that lighting invariant transforms can reduce the variability of the scene, resulting in a more reliable classification. We leverage the ideas of “data transfer” for classification, beginning with full colour images for obtaining candidate scene-level matches using global image descriptors. This is commonly followed by superpixellevel matching with local features. However, we show that if the RGB images are subjected to an illuminant invariant transform before computing the superpixel-level features, classification is significantly more robust to scene illumination effects. The approach is evaluated using three datasets. The first being our own dataset and the second being the KITTI dataset using manually generated ground truth for quantitative analysis. We qualitatively evaluate the method on a third custom dataset over a 750m trajectory.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

1.Marine ecosystems provide critically important goods and services to society, and hence their accelerated degradation underpins an urgent need to take rapid, ambitious and informed decisions regarding their conservation and management. 2.The capacity, however, to generate the detailed field data required to inform conservation planning at appropriate scales is limited by time and resource consuming methods for collecting and analysing field data at the large scales required. 3.The ‘Catlin Seaview Survey’, described here, introduces a novel framework for large-scale monitoring of coral reefs using high-definition underwater imagery collected using customized underwater vehicles in combination with computer vision and machine learning. This enables quantitative and geo-referenced outputs of coral reef features such as habitat types, benthic composition, and structural complexity (rugosity) to be generated across multiple kilometre-scale transects with a spatial resolution ranging from 2 to 6 m2. 4.The novel application of technology described here has enormous potential to contribute to our understanding of coral reefs and associated impacts by underpinning management decisions with kilometre-scale measurements of reef health. 5.Imagery datasets from an initial survey of 500 km of seascape are freely available through an online tool called the Catlin Global Reef Record. Outputs from the image analysis using the technologies described here will be updated on the online repository as work progresses on each dataset. 6.Case studies illustrate the utility of outputs as well as their potential to link to information from remote sensing. The potential implications of the innovative technologies on marine resource management and conservation are also discussed, along with the accuracy and efficiency of the methodologies deployed.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Interpolation techniques for spatial data have been applied frequently in various fields of geosciences. Although most conventional interpolation methods assume that it is sufficient to use first- and second-order statistics to characterize random fields, researchers have now realized that these methods cannot always provide reliable interpolation results, since geological and environmental phenomena tend to be very complex, presenting non-Gaussian distribution and/or non-linear inter-variable relationship. This paper proposes a new approach to the interpolation of spatial data, which can be applied with great flexibility. Suitable cross-variable higher-order spatial statistics are developed to measure the spatial relationship between the random variable at an unsampled location and those in its neighbourhood. Given the computed cross-variable higher-order spatial statistics, the conditional probability density function (CPDF) is approximated via polynomial expansions, which is then utilized to determine the interpolated value at the unsampled location as an expectation. In addition, the uncertainty associated with the interpolation is quantified by constructing prediction intervals of interpolated values. The proposed method is applied to a mineral deposit dataset, and the results demonstrate that it outperforms kriging methods in uncertainty quantification. The introduction of the cross-variable higher-order spatial statistics noticeably improves the quality of the interpolation since it enriches the information that can be extracted from the observed data, and this benefit is substantial when working with data that are sparse or have non-trivial dependence structures.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A tag-based item recommendation method generates an ordered list of items, likely interesting to a particular user, using the users past tagging behaviour. However, the users tagging behaviour varies in different tagging systems. A potential problem in generating quality recommendation is how to build user profiles, that interprets user behaviour to be effectively used, in recommendation models. Generally, the recommendation methods are made to work with specific types of user profiles, and may not work well with different datasets. In this paper, we investigate several tagging data interpretation and representation schemes that can lead to building an effective user profile. We discuss the various benefits a scheme brings to a recommendation method by highlighting the representative features of user tagging behaviours on a specific dataset. Empirical analysis shows that each interpretation scheme forms a distinct data representation which eventually affects the recommendation result. Results on various datasets show that an interpretation scheme should be selected based on the dominant usage in the tagging data (i.e. either higher amount of tags or higher amount of items present). The usage represents the characteristic of user tagging behaviour in the system. The results also demonstrate how the scheme is able to address the cold-start user problem.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Background and Objectives: Cannabis use is common in early psychosis and has been linked to adverse outcomes. However, factors that influence and maintain change in cannabis use in this population are poorly understood. An existing prospective dataset was used to predict abstinence from cannabis use over the 6 months following inpatient admission for early psychosis. Methods: Participants were 67 inpatients with early psychosis who had used cannabis in the 6 weeks prior to admission. Current diagnoses of psychotic and substance use disorders were confirmed using a clinical checklist and structured diagnostic interview. Measures of clinical, substance use and social and occupational functioning were administered at baseline and at least fortnightly over the 6-month follow up. Results: No substance use or clinical variables were associated with 6-months’ of cannabis abstinence. Only Caucasian ethnicity, living in private accommodation and receiving an income before the admission were predictive. Only private accommodation and receiving an income were significant predictors of abstinence when these variables were entered into a multivariate analysis. Conclusions: While the observed relationships do not necessarily imply causation, they suggest that more optimal substance use outcomes could be achieved by addressing the accommodation and employment needs of patients.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

To enhance the efficiency of regression parameter estimation by modeling the correlation structure of correlated binary error terms in quantile regression with repeated measurements, we propose a Gaussian pseudolikelihood approach for estimating correlation parameters and selecting the most appropriate working correlation matrix simultaneously. The induced smoothing method is applied to estimate the covariance of the regression parameter estimates, which can bypass density estimation of the errors. Extensive numerical studies indicate that the proposed method performs well in selecting an accurate correlation structure and improving regression parameter estimation efficiency. The proposed method is further illustrated by analyzing a dental dataset.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Low voltage distribution networks feature a high degree of load unbalance and the addition of rooftop photovoltaic is driving further unbalances in the network. Single phase consumers are distributed across the phases but even if the consumer distribution was well balanced when the network was constructed changes will occur over time. Distribution transformer losses are increased by unbalanced loadings. The estimation of transformer losses is a necessary part of the routine upgrading and replacement of transformers and the identification of the phase connections of households allows a precise estimation of the phase loadings and total transformer loss. This paper presents a new technique and preliminary test results for a method of automatically identifying the phase of each customer by correlating voltage information from the utility's transformer system with voltage information from customer smart meters. The techniques are novel as they are purely based upon a time series of electrical voltage measurements taken at the household and at the distribution transformer. Experimental results using a combination of electrical power and current of the real smart meter datasets demonstrate the performance of our techniques.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Recently Convolutional Neural Networks (CNNs) have been shown to achieve state-of-the-art performance on various classification tasks. In this paper, we present for the first time a place recognition technique based on CNN models, by combining the powerful features learnt by CNNs with a spatial and sequential filter. Applying the system to a 70 km benchmark place recognition dataset we achieve a 75% increase in recall at 100% precision, significantly outperforming all previous state of the art techniques. We also conduct a comprehensive performance comparison of the utility of features from all 21 layers for place recognition, both for the benchmark dataset and for a second dataset with more significant viewpoint changes.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In studies of germ cell transplantation, measureing tubule diameters and counting cells from different populations using antibodies as markers are very important. Manual measurement of tubule sizes and cell counts is a tedious and sanity grinding work. In this paper, we propose a new boundary weighting based tubule detection method. We first enhance the linear features of the input image and detect the approximate centers of tubules. Next, a boundary weighting transform is applied to the polar transformed image of each tubule region and a circular shortest path is used for the boundary detection. Then, ellipse fitting is carried out for tubule selection and measurement. The algorithm has been tested on a dataset consisting of 20 images, each having about 20 tubules. Experiments show that the detection results of our algorithm are very close to the results obtained manually. © 2013 IEEE.