973 results for Scene classification
Abstract:
Background: Timely notification of cancer cases is critical for cancer monitoring and prevention. However, abstracting and classifying cancer from the free text of pathology reports and other relevant documents, such as death certificates, is a complex and time-consuming activity. Aims: This paper investigates approaches for automatically detecting notifiable cancer cases as the cause of death from free-text death certificates supplied to Cancer Registries. Method: A number of machine learning classifiers were studied. Features were extracted using natural language processing techniques and the Medtex toolkit; they included stemmed words, bi-grams, and concepts from the SNOMED CT medical terminology. The baseline was a keyword spotter using keywords extracted from the long descriptions of ICD-10 cancer-related codes. Results: Death certificates with notifiable cancer listed as the cause of death can be effectively identified with the methods studied in this paper. A Support Vector Machine (SVM) classifier achieved the best performance, with an overall F-measure of 0.9866, when evaluated on a set of 5,000 free-text death certificates using the token stem feature set. The SNOMED CT concept plus token stem feature set had the lowest variance (0.0032) and false negative rate (0.0297) while achieving an F-measure of 0.9864. The SVM classifier accounted for the first 18 of the top 40 evaluated runs and was the most robust classifier, with a variance of 0.001141, half that of the other classifiers. Conclusion: The choice of features had the greatest influence on classifier performance, although the type of classifier employed also affected performance. In contrast, the feature weighting scheme had a negligible effect. In particular, stemmed tokens, with or without SNOMED CT concepts, were the most effective features when combined with an SVM classifier.
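As a rough illustration of the token-stem-plus-SVM setup described above, the sketch below builds a comparable pipeline with NLTK's Porter stemmer and scikit-learn; it is not the Medtex-based system from the paper, and the example texts, labels and parameters are invented for demonstration.

```python
# Minimal sketch of a stemmed-token + SVM text classification pipeline.
# Illustrative only: the paper's system uses the Medtex toolkit and SNOMED CT
# concept features, which are not reproduced here.
import re
from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

stemmer = PorterStemmer()

def stem_tokens(text):
    # Lowercase, tokenise on alphabetic runs and stem each token,
    # mirroring the "token stem" feature set.
    return [stemmer.stem(tok) for tok in re.findall(r"[a-z]+", text.lower())]

# Unigrams and bigrams of stemmed tokens (the abstract also mentions bi-grams).
pipeline = make_pipeline(
    TfidfVectorizer(tokenizer=stem_tokens, ngram_range=(1, 2)),
    LinearSVC(C=1.0),
)

# Hypothetical training data: free-text causes of death, 1 = notifiable cancer.
texts = ["metastatic adenocarcinoma of the lung", "ischaemic heart disease"]
labels = [1, 0]
pipeline.fit(texts, labels)
print(pipeline.predict(["carcinoma of the prostate"]))
```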
Abstract:
Recent advances suggest that encoding images through Symmetric Positive Definite (SPD) matrices and then interpreting such matrices as points on Riemannian manifolds can lead to increased classification performance. Taking into account manifold geometry is typically done via (1) embedding the manifolds in tangent spaces, or (2) embedding into Reproducing Kernel Hilbert Spaces (RKHS). While embedding into tangent spaces allows the use of existing Euclidean-based learning algorithms, manifold shape is only approximated which can cause loss of discriminatory information. The RKHS approach retains more of the manifold structure, but may require non-trivial effort to kernelise Euclidean-based learning algorithms. In contrast to the above approaches, in this paper we offer a novel solution that allows SPD matrices to be used with unmodified Euclidean-based learning algorithms, with the true manifold shape well-preserved. Specifically, we propose to project SPD matrices using a set of random projection hyperplanes over RKHS into a random projection space, which leads to representing each matrix as a vector of projection coefficients. Experiments on face recognition, person re-identification and texture classification show that the proposed approach outperforms several recent methods, such as Tensor Sparse Coding, Histogram Plus Epitome, Riemannian Locality Preserving Projection and Relational Divergence Classification.
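A minimal sketch of the kernelised random projection idea is given below, assuming a Gaussian kernel on the log-Euclidean distance between SPD matrices and hyperplanes drawn as random combinations of a set of anchor matrices; the actual kernel and projection construction used in the paper may differ.

```python
# Illustrative sketch: representing SPD matrices as vectors of random
# projection coefficients computed in an RKHS. The kernel choice and the way
# hyperplanes are drawn are assumptions for illustration only.
import numpy as np

def spd_log(X):
    # Matrix logarithm of an SPD matrix via its eigendecomposition.
    w, V = np.linalg.eigh(X)
    return (V * np.log(w)) @ V.T

def log_euclidean_kernel(X, Y, sigma=1.0):
    # Gaussian kernel on the log-Euclidean distance between two SPD matrices.
    d = np.linalg.norm(spd_log(X) - spd_log(Y))
    return np.exp(-d**2 / (2 * sigma**2))

def random_projection_features(spd_mats, anchors, n_proj=64, seed=0):
    # Each hyperplane in the RKHS is a random combination of the feature maps
    # of a set of anchor SPD matrices; a matrix's coefficients against these
    # hyperplanes form its Euclidean vector representation.
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((len(anchors), n_proj))
    K = np.array([[log_euclidean_kernel(X, A) for A in anchors] for X in spd_mats])
    return K @ W   # shape (n_matrices, n_proj), usable by Euclidean classifiers

# Hypothetical usage with small random covariance (SPD) matrices.
rng = np.random.default_rng(1)
def rand_spd(d=5):
    A = rng.standard_normal((d, d))
    return A @ A.T + d * np.eye(d)

anchors = [rand_spd() for _ in range(10)]
queries = [rand_spd() for _ in range(3)]
print(random_projection_features(queries, anchors).shape)   # (3, 64)
```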
Abstract:
This paper describes a novel system for automatic classification of images obtained from Anti-Nuclear Antibody (ANA) pathology tests on Human Epithelial type 2 (HEp-2) cells using the Indirect Immunofluorescence (IIF) protocol. The IIF protocol on HEp-2 cells has been the hallmark method to identify the presence of ANAs, due to its high sensitivity and the large range of antigens that can be detected. However, it suffers from numerous shortcomings, such as being subjective as well as time- and labour-intensive. Computer Aided Diagnostic (CAD) systems have been developed to address these problems, which automatically classify a HEp-2 cell image into one of its known patterns (e.g. speckled, homogeneous). Most existing CAD systems use handpicked features to represent a HEp-2 cell image, which may only work in limited scenarios. We propose a novel automatic cell image classification method termed Cell Pyramid Matching (CPM), which combines regional histograms of visual words with the Multiple Kernel Learning framework. We present a study of several variations of generating histograms and show the efficacy of the system on two publicly available datasets: the ICPR HEp-2 cell classification contest dataset and the SNPHEp-2 dataset.
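The regional histogram-of-visual-words idea underlying CPM can be sketched as follows; the descriptor (raw grayscale patches), vocabulary size and 2x2 region layout are arbitrary placeholders, and the Multiple Kernel Learning stage is omitted.

```python
# Minimal sketch of regional visual-word histograms, the general idea behind
# cell/spatial pyramid representations. Not the CPM configuration itself.
import numpy as np
from sklearn.cluster import KMeans

def dense_patches(image, size=8, step=4):
    # Crude local descriptors: raw grayscale patches on a dense grid.
    coords = [(y, x)
              for y in range(0, image.shape[0] - size + 1, step)
              for x in range(0, image.shape[1] - size + 1, step)]
    patches = np.array([image[y:y + size, x:x + size].ravel() for y, x in coords])
    return patches, coords

def regional_histograms(image, vocab, grid=(2, 2)):
    # Assign each patch to its nearest visual word, build one histogram per
    # spatial cell, then concatenate and normalise.
    patches, coords = dense_patches(image)
    words = vocab.predict(patches)
    h, w = image.shape
    hist = np.zeros((grid[0] * grid[1], vocab.n_clusters))
    for (y, x), word in zip(coords, words):
        cell = (y * grid[0] // h) * grid[1] + (x * grid[1] // w)
        hist[cell, word] += 1
    return (hist / max(hist.sum(), 1)).ravel()

# Hypothetical usage: learn a small vocabulary from random "cell" images.
rng = np.random.default_rng(0)
train = [rng.random((64, 64)) for _ in range(5)]
all_patches = np.vstack([dense_patches(im)[0] for im in train])
vocab = KMeans(n_clusters=16, n_init=4, random_state=0).fit(all_patches)
print(regional_histograms(train[0], vocab).shape)   # (2*2*16,)
```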
Abstract:
Existing multi-model approaches for image set classification extract local models by clustering each image set individually only once, with fixed clusters used for matching with other image sets. However, this may result in the two closest clusters representing different characteristics of an object, due to different undesirable environmental conditions (such as variations in illumination and pose). To address this problem, we propose to constrain the clustering of each query image set by forcing the clusters to have resemblance to the clusters in the gallery image sets. We first define a Frobenius norm distance between subspaces over Grassmann manifolds based on reconstruction error. We then extract local linear subspaces from a gallery image set via sparse representation. For each local linear subspace, we adaptively construct the corresponding closest subspace from the samples of a probe image set by joint sparse representation. We show that by minimising the sparse representation reconstruction error, we approach the nearest point on a Grassmann manifold. Experiments on Honda, ETH-80 and Cambridge-Gesture datasets show that the proposed method consistently outperforms several other recent techniques, such as Affine Hull based Image Set Distance (AHISD), Sparse Approximated Nearest Points (SANP) and Manifold Discriminant Analysis (MDA).
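For orientation, the sketch below shows the basic subspace representation this line of work builds on: extracting a local linear subspace from an image set via SVD and comparing subspaces through a projection Frobenius norm distance on the Grassmann manifold. The sparse and joint sparse representation steps of the proposed method are not reproduced.

```python
# Illustrative sketch: subspace extraction from an image set and a projection
# Frobenius norm distance between subspaces on the Grassmann manifold.
import numpy as np

def image_set_subspace(images, dim=5):
    # Stack vectorised images as columns and keep the leading left singular
    # vectors as an orthonormal basis of the local linear subspace.
    X = np.stack([im.ravel() for im in images], axis=1)
    X = X - X.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :dim]

def projection_frobenius_distance(U1, U2):
    # Distance between subspaces via their orthogonal projection matrices.
    P1, P2 = U1 @ U1.T, U2 @ U2.T
    return np.linalg.norm(P1 - P2, ord="fro")

# Hypothetical usage with two small random image sets.
rng = np.random.default_rng(0)
set_a = [rng.random((16, 16)) for _ in range(20)]
set_b = [rng.random((16, 16)) for _ in range(20)]
Ua, Ub = image_set_subspace(set_a), image_set_subspace(set_b)
print(projection_frobenius_distance(Ua, Ub))
```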
Abstract:
Time series classification has been extensively explored in many fields of study. Most methods are based on the historical or current information extracted from data. However, if interest is in a specific future time period, methods that directly relate to forecasts of time series are much more appropriate. An approach to time series classification is proposed based on a polarization measure of forecast densities of time series. By fitting autoregressive models, forecast replicates of each time series are obtained via the bias-corrected bootstrap, and a stationarity correction is considered when necessary. Kernel estimators are then employed to approximate forecast densities, and discrepancies of forecast densities of pairs of time series are estimated by a polarization measure, which evaluates the extent to which two densities overlap. Following the distributional properties of the polarization measure, a discriminant rule and a clustering method are proposed to conduct the supervised and unsupervised classification, respectively. The proposed methodology is applied to both simulated and real data sets, and the results show desirable properties.
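The following sketch illustrates the overall recipe on a pair of series, assuming an AR(1) model, a plain residual bootstrap and a simple overlap coefficient in place of the bias-corrected bootstrap and polarization measure used in the paper.

```python
# Minimal sketch of forecast-density comparison for a pair of time series:
# fit an AR(1) model, generate bootstrap forecast replicates, estimate the
# forecast densities with a kernel estimator, and measure their overlap.
import numpy as np
from scipy.stats import gaussian_kde

def ar1_forecast_replicates(y, horizon=1, n_rep=500, seed=0):
    rng = np.random.default_rng(seed)
    # Least-squares fit of y_t = c + phi * y_{t-1} + e_t.
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    (c, phi), *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    resid = y[1:] - X @ np.array([c, phi])
    reps = np.empty(n_rep)
    for r in range(n_rep):
        x = y[-1]
        for _ in range(horizon):
            # Bootstrap future values by resampling the fitted residuals.
            x = c + phi * x + rng.choice(resid)
        reps[r] = x
    return reps

def density_overlap(a, b, grid_size=512):
    # Overlap coefficient of two kernel density estimates: the integral of the
    # pointwise minimum of the densities (1 = identical, 0 = disjoint supports).
    ka, kb = gaussian_kde(a), gaussian_kde(b)
    grid = np.linspace(min(a.min(), b.min()), max(a.max(), b.max()), grid_size)
    return np.minimum(ka(grid), kb(grid)).sum() * (grid[1] - grid[0])

# Hypothetical usage with two simulated AR(1) series.
rng = np.random.default_rng(1)
e1, e2 = rng.standard_normal(200), rng.standard_normal(200)
y1, y2 = np.zeros(200), np.zeros(200)
for t in range(1, 200):
    y1[t] = 0.8 * y1[t - 1] + e1[t]    # persistent series
    y2[t] = -0.3 * y2[t - 1] + e2[t]   # mean-reverting series
print(density_overlap(ar1_forecast_replicates(y1), ar1_forecast_replicates(y2)))
```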
Abstract:
The proliferation of news reports published on online websites and news information sharing among social media users necessitates effective techniques for analysing the image, text and video data related to news topics. This paper presents the first study to classify affective facial images on emerging news topics. The proposed system dynamically monitors and selects current hot (of great interest) news topics with strong affective interestingness using textual keywords in news articles and social media discussions. Images from the selected hot topics are extracted and classified into three emotion categories, positive, neutral and negative, based on the facial expressions of subjects in the images. Performance evaluations on two facial image datasets collected from real-world resources demonstrate the applicability and effectiveness of the proposed system in affective classification of facial images in news reports. Facial expression shows high consistency with the affective textual content in news reports for positive emotion, while only low correlation was observed for neutral and negative emotions. The system can be directly used in applications such as assisting editors in choosing photos with appropriate affective semantics for a given topic during news report preparation.
Abstract:
Next Generation Sequencing (NGS) has revolutionised molecular biology, resulting in an explosion of data sets and an increasing role in clinical practice. Such applications necessarily require rapid identification of the organism as a prelude to annotation and further analysis. NGS data consist of a substantial number of short sequence reads, given context through downstream assembly and annotation, a process requiring reads consistent with the assumed species or species group. Highly accurate results have been obtained for restricted sets using SVM classifiers, but such methods are difficult to parallelise and success depends on careful attention to feature selection. This work examines the problem at very large scale, using a mix of synthetic and real data with a view to determining the overall structure of the problem and the effectiveness of parallel ensembles of simpler classifiers (principally random forests) in addressing the challenges of large scale genomics.
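A toy version of the read classification task is sketched below, assuming 4-mer count features and a single scikit-learn random forest; the paper's feature engineering and parallel ensemble architecture are not reproduced, and the simulated reads are invented for demonstration.

```python
# Illustrative sketch: classifying short sequence reads by organism with a
# random forest over k-mer count features.
from itertools import product
import numpy as np
from sklearn.ensemble import RandomForestClassifier

K = 4
KMERS = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=K))}

def kmer_counts(read):
    # Count overlapping k-mers in a read; unknown characters are skipped.
    v = np.zeros(len(KMERS))
    for i in range(len(read) - K + 1):
        idx = KMERS.get(read[i:i + K])
        if idx is not None:
            v[idx] += 1
    return v

# Hypothetical training data: simulated reads from two "organisms" with
# different base composition.
rng = np.random.default_rng(0)
def simulate_reads(p, n=200, length=100):
    return ["".join(rng.choice(list("ACGT"), size=length, p=p)) for _ in range(n)]

reads = simulate_reads([0.4, 0.1, 0.1, 0.4]) + simulate_reads([0.25, 0.25, 0.25, 0.25])
labels = [0] * 200 + [1] * 200
X = np.array([kmer_counts(r) for r in reads])

clf = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=0).fit(X, labels)
print(clf.score(X, labels))
```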
Abstract:
Within the field of Information Systems, a good proportion of research is concerned with the work organisation and this has, to some extent, restricted the kind of application areas given consideration. Yet, it is clear that information and communication technology deployments beyond the work organisation are acquiring increased importance in our lives. With this in mind, we offer a field study of the appropriation of an online play space known as Habbo Hotel. Habbo Hotel, as a site of media convergence, incorporates social networking and digital gaming functionality. Our research highlights the ethical problems such a dual classification of technology may bring. We focus upon a particular set of activities undertaken within and facilitated by the space – scamming. Scammers dupe members with respect to their ‘Furni’, virtual objects that have online and offline economic value. Through our analysis we show that sometimes, online activities are bracketed off from those defined as offline and that this can be related to how the technology is classified by members – as a social networking site and/or a digital game. In turn, this may affect members’ beliefs about rights and wrongs. We conclude that given increasing media convergence, the way forward is to continue the project of educating people regarding the difficulties of determining rights and wrongs, and how rights and wrongs may be acted out with respect to new technologies of play online and offline.
Abstract:
This paper is about localising across extreme lighting and weather conditions. We depart from the traditional point-feature-based approach as matching under dramatic appearance changes is a brittle and hard thing. Point feature detectors are fixed and rigid procedures which pass over an image examining small, low-level structure such as corners or blobs. They apply the same criteria to all images of all places. This paper takes a contrary view and asks what is possible if instead we learn a bespoke detector for every place. Our localisation task then turns into curating a large bank of spatially indexed detectors and we show that this yields vastly superior performance in terms of robustness in exchange for a reduced but tolerable metric precision. We present an unsupervised system that produces broad-region detectors for distinctive visual elements, called scene signatures, which can be associated across almost all appearance changes. We show, using 21 km of data collected over a period of 3 months, that our system is capable of producing metric localisation estimates from night-to-day or summer-to-winter conditions.
Abstract:
The noble idea of studying seminal works to ‘see what we can learn’ has turned in the 1990s into ‘let’s see what we can take’ and in the last decade a more toxic derivative ‘what else can’t we take’. That is my observation as a student of architecture in the 1990s, and as a practitioner in the 2000s. In 2010, the sense that something is ending is clear. The next generation is rising and their gaze has shifted. The idea of classification (as a means of separation) was previously rejected by a generation of Postmodernists; the usefulness of difference declined. It’s there in the presence of plurality in the resulting architecture, a decision to mine history and seize in a willful manner. This is a process of looking back but never forward. It has been a mono-culture of absorption. The mono-culture rejected the pursuit of the realistic. It is a blanket suffocating all practice of architecture in this country from the mercantile to the intellectual. Independent reviews of Australia’s recent contributions to the Venice Architecture Biennales confirm the malaise. The next generation is beginning to reconsider classification as a means of unification. By acknowledging the characteristics of competing forces it is possible to bring them into a state of tension. Seeking a beautiful contrast is a means to a new end. In the political setting, this is described by Noel Pearson as the radical centre[1]. The concept transcends the political and in its most essential form is a cultural phenomenon. It resists the compromised position and suggests that we can look back while looking forward. The radical centre is the only demonstrated opportunity where it is possible to pursue a realistic architecture. A realistic architecture in Australia may be partially resolved by addressing our anxiety of permanence. Farrelly’s built desires[2] and Markham’s ritual demonstrations[3] are two ways into understanding the broader spectrum of permanence. But I think they are downstream of our core problem. Our problem, as architects, is that we are yet to come to terms with this place. Some call it landscape others call it country. Australian cities were laid out on what was mistaken for a blank canvas. On some occasions there was the consideration of the landscape when it presented insurmountable physical obstacles. The architecture since has continued to work on its piece of a constantly blank canvas. Even more ironic is the commercial awards programs that represent a claim within this framework but at best can only establish a dialogue within itself. This is a closed system unable to look forward. It is said that Melbourne is the most European city in the southern hemisphere but what is really being described there is the limitation of a senseless grid. After all, if Dutch landscape informs Dutch architecture why can’t the Australian landscape inform Australian architecture? To do that, we would have to acknowledge our moribund grasp of the meaning of the Australian landscape. Or more precisely what Indigenes call Country[4]. This is a complex notion and there are different ways into it. Country is experienced and understood through the senses and seared into memory. If one begins design at that starting point it is not unreasonable to think we can arrive at an end point that is a counter trajectory to where we have taken ourselves. A recent studio with Masters students confirmed this. Start by finding Country and it would be impossible to end up with a building looking like an Aboriginal man’s face. 
To date architecture in Australia has overwhelmingly ignored Country on the back of terra nullius. It can’t seem to get past the picturesque. Why is it so hard? The art world came to terms with this challenge, so too did the legal establishment; even the political scene headed into new waters. It would be easy to blame the budgets of commerce or the constraints of program or even the pressure of success. But that is too easy. Those factors are in fact the kind of limitations that opportunities grow out of. The past decade of economic plenty has, for the most part, smothered the idea that our capitals might enable civic settings or an architecture that is able to look past lot line boundaries in a dignified manner. The denial of the opportunity for these settings to be prompted by the Country they occupy is criminal. The public realm is arrested in its development because we refuse to accept Country as a spatial condition. What we seem to be able to embrace is literal and symbolic gestures, usually taking the form of trumped-up art installations. All talk – no action. To continue to leave the public realm to the stewardship of mercantile interests is like embracing derivative lending after the global financial crisis. Herein rests an argument for why we need a resourced Government Architect’s office operating not as an isolated lobbyist for business but as a steward of the public realm for both the past and the future. New South Wales is the leading model with Queensland close behind. That is not to say both do not have flaws, but current calls for their cessation on the grounds of design parity poorly mask commercial self-interest. In Queensland, lobbyists are now heavily regulated with an aim to ensure integrity and accountability. In essence, what I am speaking of will not be found in Reconciliation Action Plans that double as business plans, or the mining of Aboriginal culture for the next marketing gimmick, or even discussions around how to make buildings more ‘Aboriginal’. It will come from the next generation who reject the noxious mono-culture of absorption and embrace a counter trajectory to pursue an architecture of realism.
Abstract:
Determination of sequence similarity is a central issue in computational biology, a problem addressed primarily through BLAST, an alignment-based heuristic which has underpinned much of the analysis and annotation of the genomic era. Despite their success, alignment-based approaches scale poorly with increasing data set size, and are not robust under structural sequence rearrangements. Successive waves of innovation in sequencing technologies – so-called Next Generation Sequencing (NGS) approaches – have led to an explosion in data availability, challenging existing methods and motivating novel approaches to sequence representation and similarity scoring, including adaptation of existing methods from other domains such as information retrieval. In this work, we investigate locality-sensitive hashing of sequences through binary document signatures, applying the method to a bacterial protein classification task. Here, the goal is to predict the gene family to which a given query protein belongs. Experiments carried out on a pair of small but biologically realistic datasets (the full protein repertoires of families of Chlamydia and Staphylococcus aureus genomes respectively) show that a measure of similarity obtained by locality-sensitive hashing gives highly accurate results while offering a number of avenues which will lead to substantial performance improvements over BLAST.
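The sketch below illustrates the general locality-sensitive hashing idea with SimHash-style binary signatures over k-mer counts and Hamming similarity; the binary document signature scheme used in the paper may differ in its details, and the example sequences are invented.

```python
# Minimal sketch of locality-sensitive hashing for sequences: represent each
# protein by k-mer counts, project onto random sign hyperplanes (SimHash) and
# compare binary signatures by Hamming similarity.
from itertools import product
import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids
K = 2
KMERS = {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=K))}

def kmer_vector(seq):
    v = np.zeros(len(KMERS))
    for i in range(len(seq) - K + 1):
        idx = KMERS.get(seq[i:i + K])
        if idx is not None:
            v[idx] += 1
    return v

def signature(seq, hyperplanes):
    # One bit per random hyperplane: the sign of the projection.
    return kmer_vector(seq) @ hyperplanes > 0

def hamming_similarity(sig_a, sig_b):
    # Fraction of signature bits in agreement.
    return np.mean(sig_a == sig_b)

# Hypothetical usage: similar sequences should share more signature bits.
rng = np.random.default_rng(0)
H = rng.standard_normal((len(KMERS), 256))       # 256-bit signatures
a = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
b = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEAQ"           # one substitution from a
c = "GWTLNSAGYLLGPHAVGNHRSFSDKNGLTS"
print(hamming_similarity(signature(a, H), signature(b, H)))
print(hamming_similarity(signature(a, H), signature(c, H)))
```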