1000 resultados para CLASSIFICATION


Relevância:

20.00% 20.00%

Publicador:

Resumo:

We consider the problem of structured classification, where the task is to predict a label y from an input x, and y has meaningful internal structure. Our framework includes supervised training of Markov random fields and weighted context-free grammars as special cases. We describe an algorithm that solves the large-margin optimization problem defined in [12], using an exponential-family (Gibbs distribution) representation of structured objects. The algorithm is efficient—even in cases where the number of labels y is exponential in size—provided that certain expectations under Gibbs distributions can be calculated efficiently. The method for structured labels relies on a more general result, specifically the application of exponentiated gradient updates [7, 8] to quadratic programs.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background: Strategies for cancer reduction and management are targeted at both individual and area levels. Area-level strategies require careful understanding of geographic differences in cancer incidence, in particular the association with factors such as socioeconomic status, ethnicity and accessibility. This study aimed to identify the complex interplay of area-level factors associated with high area-specific incidence of Australian priority cancers using a classification and regression tree (CART) approach. Methods: Area-specific smoothed standardised incidence ratios were estimated for priority-area cancers across 478 statistical local areas in Queensland, Australia (1998-2007, n=186,075). For those cancers with significant spatial variation, CART models were used to identify whether area-level accessibility, socioeconomic status and ethnicity were associated with high area-specific incidence. Results: The accessibility of a person’s residence had the most consistent association with the risk of cancer diagnosis across the specific cancers. Many cancers were likely to have high incidence in more urban areas, although male lung cancer and cervical cancer tended to have high incidence in more remote areas. The impact of socioeconomic status and ethnicity on these associations differed by type of cancer. Conclusions: These results highlight the complex interactions between accessibility, socioeconomic status and ethnicity in determining cancer incidence risk.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Follicle classification is an important aid to the understanding of follicular development and atresia. Some bovine primordial follicles have the classical primordial shape, but ellipsoidal shaped follicles with some cuboidal granulosa cells at the poles are far more common. Preantral follicles have one of two basal lamina phenotypes, either a single aligned layer or one with additional layers. In antral follicles <5 mm diameter, half of the healthy follicles have columnar shaped basal granulosa cells and additional layers of basal lamina, which appear as loops in cross section (‘loopy’). The remainder have aligned single-layered follicular basal laminas with rounded basal cells, and contain better quality oocytes than the loopy/columnar follicles. In sizes >5 mm, only aligned/rounded phenotypes are present. Dominant and subordinate follicles can be identified by ultrasound and/or histological examination of pairs of ovaries. Atretic follicles <5 mm are either basal atretic or antral atretic, named on the basis of the location in the membrana granulosa where cells die first. Basal atretic follicles have considerable biological differences to antral atretic follicles. In follicles >5 mm, only antral atresia is observed. The concentrations of follicular fluid steroid hormones can be used to classify atresia and distinguish some of the different types of atresia; however, this method is unlikely to identify follicles early in atresia, and hence misclassify them as healthy. Other biochemical and histological methods can be used, but since cell death is a part of normal homoeostatis, deciding when a follicle has entered atresia remains somewhat subjective.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The most common human cancers are malignant neoplasms of the skin. Incidence of cutaneous melanoma is rising especially steeply, with minimal progress in non-surgical treatment of advanced disease. Despite significant effort to identify independent predictors of melanoma outcome, no accepted histopathological, molecular or immunohistochemical marker defines subsets of this neoplasm. Accordingly, though melanoma is thought to present with different 'taxonomic' forms, these are considered part of a continuous spectrum rather than discrete entities. Here we report the discovery of a subset of melanomas identified by mathematical analysis of gene expression in a series of samples. Remarkably, many genes underlying the classification of this subset are differentially regulated in invasive melanomas that form primitive tubular networks in vitro, a feature of some highly aggressive metastatic melanomas. Global transcript analysis can identify unrecognized subtypes of cutaneous melanoma and predict experimentally verifiable phenotypic characteristics that may be of importance to disease progression.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Most learning paradigms impose a particular syntax on the class of concepts to be learned; the chosen syntax can dramatically affect whether the class is learnable or not. For classification paradigms, where the task is to determine whether the underlying world does or does not have a particular property, how that property is represented has no implication on the power of a classifier that just outputs 1’s or 0’s. But is it possible to give a canonical syntactic representation of the class of concepts that are classifiable according to the particular criteria of a given paradigm? We provide a positive answer to this question for classification in the limit paradigms in a logical setting, with ordinal mind change bounds as a measure of complexity. The syntactic characterization that emerges enables to derive that if a possibly noncomputable classifier can perform the task assigned to it by the paradigm, then a computable classifier can also perform the same task. The syntactic characterization is strongly related to the difference hierarchy over the class of open sets of some topological space; this space is naturally defined from the class of possible worlds and possible data of the learning paradigm.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Many existing schemes for malware detection are signature-based. Although they can effectively detect known malwares, they cannot detect variants of known malwares or new ones. Most network servers do not expect executable code in their in-bound network traffic, such as on-line shopping malls, Picasa, Youtube, Blogger, etc. Therefore, such network applications can be protected from malware infection by monitoring their ports to see if incoming packets contain any executable contents. This paper proposes a content-classification scheme that identifies executable content in incoming packets. The proposed scheme analyzes the packet payload in two steps. It first analyzes the packet payload to see if it contains multimedia-type data (such as . If not, then it classifies the payload either as text-type (such as or executable. Although in our experiments the proposed scheme shows a low rate of false negatives and positives (4.69% and 2.53%, respectively), the presence of inaccuracies still requires further inspection to efficiently detect the occurrence of malware. In this paper, we also propose simple statistical and combinatorial analysis to deal with false positives and negatives.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

People interact with mobile computing devices everywhere, while sitting, walking, running or even driving. Adapting the interface to suit these contexts is important, thus this paper proposes a simple human activity classification system. Our approach uses a vector magnitude recognition technique to detect and classify when a person is stationary (or not walking), casually walking, or jogging, without any prior training. The user study has confirmed the accuracy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper considers issues of methodological innovation in communication, media and cultural studies, that arise out of the extent to which we now live in a media environment characterised by an digital media abundance, the convergence of media platforms, content and services, and the globalisation of media content through ubiquitous computing and high-speed broadband networks. These developments have also entailed a shift in the producer-consumer relationships that characterised the 20th century mass communications paradigm, with the rapid proliferation of user-created content, accelerated innovation, the growing empowerment of media users themselves, and the blurring of distinctions between public and private, as well as age-based distinctions in terms of what media can be accessed by whom and for what purpose. It considers these issues through a case study of the Australian Law Reform Commission's National Classification Scheme Review.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Purpose – The work presented in this paper aims to provide an approach to classifying web logs by personal properties of users. Design/methodology/approach – The authors describe an iterative system that begins with a small set of manually labeled terms, which are used to label queries from the log. A set of background knowledge related to these labeled queries is acquired by combining web search results on these queries. This background set is used to obtain many terms that are related to the classification task. The system then ranks each of the related terms, choosing those that most fit the personal properties of the users. These terms are then used to begin the next iteration. Findings – The authors identify the difficulties of classifying web logs, by approaching this problem from a machine learning perspective. By applying the approach developed, the authors are able to show that many queries in a large query log can be classified. Research limitations/implications – Testing results in this type of classification work is difficult, as the true personal properties of web users are unknown. Evaluation of the classification results in terms of the comparison of classified queries to well known age-related sites is a direction that is currently being exploring. Practical implications – This research is background work that can be incorporated in search engines or other web-based applications, to help marketing companies and advertisers. Originality/value – This research enhances the current state of knowledge in short-text classification and query log learning. Classification schemes, Computer networks, Information retrieval, Man-machine systems, User interfaces

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we describe the main processes and operations in mining industries and present a comprehensive survey of operations research methodologies that have been applied over the last several decades. The literature review is classified into four main categories: mine design; mine production; mine transportation; and mine evaluation. Mining design models are further separated according to two main mining methods: open-pit and underground. Moreover, mine production models are subcategorised into two groups: ore mining and coal mining. Mine transportation models are further partitioned in accordance with fleet management, truck haulage and train scheduling. Mine evaluation models are further subdivided into four clusters in terms of mining method selection, quality control, financial risks and environmental protection. The main characteristics of four Australian commercial mining software are addressed and compared. This paper bridges the gaps in the literature and motivates researchers to develop more applicable, realistic and comprehensive operations research models and solution techniques that are directly linked with mining industries.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

It is a big challenge to acquire correct user profiles for personalized text classification since users may be unsure in providing their interests. Traditional approaches to user profiling adopt machine learning (ML) to automatically discover classification knowledge from explicit user feedback in describing personal interests. However, the accuracy of ML-based methods cannot be significantly improved in many cases due to the term independence assumption and uncertainties associated with them. This paper presents a novel relevance feedback approach for personalized text classification. It basically applies data mining to discover knowledge from relevant and non-relevant text and constraints specific knowledge by reasoning rules to eliminate some conflicting information. We also developed a Dempster-Shafer (DS) approach as the means to utilise the specific knowledge to build high-quality data models for classification. The experimental results conducted on Reuters Corpus Volume 1 and TREC topics support that the proposed technique achieves encouraging performance in comparing with the state-of-the-art relevance feedback models.