1000 resultados para Fishes - Classification
Resumo:
Workflow nets, a particular class of Petri nets, have become one of the standard ways to model and analyze workflows. Typically, they are used as an abstraction of the workflow that is used to check the so-called soundness property. This property guarantees the absence of livelocks, deadlocks, and other anomalies that can be detected without domain knowledge. Several authors have proposed alternative notions of soundness and have suggested to use more expressive languages, e.g., models with cancellations or priorities. This paper provides an overview of the different notions of soundness and investigates these in the presence of different extensions of workflow nets.We will show that the eight soundness notions described in the literature are decidable for workflow nets. However, most extensions will make all of these notions undecidable. These new results show the theoretical limits of workflow verification. Moreover, we discuss some of the analysis approaches described in the literature.
Resumo:
The use of appropriate features to characterise an output class or object is critical for all classification problems. In order to find optimal feature descriptors for vegetation species classification in a power line corridor monitoring application, this article evaluates the capability of several spectral and texture features. A new idea of spectral–texture feature descriptor is proposed by incorporating spectral vegetation indices in statistical moment features. The proposed method is evaluated against several classic texture feature descriptors. Object-based classification method is used and a support vector machine is employed as the benchmark classifier. Individual tree crowns are first detected and segmented from aerial images and different feature vectors are extracted to represent each tree crown. The experimental results showed that the proposed spectral moment features outperform or can at least compare with the state-of-the-art texture descriptors in terms of classification accuracy. A comprehensive quantitative evaluation using receiver operating characteristic space analysis further demonstrates the strength of the proposed feature descriptors.
Resumo:
Sample complexity results from computational learning theory, when applied to neural network learning for pattern classification problems, suggest that for good generalization performance the number of training examples should grow at least linearly with the number of adjustable parameters in the network. Results in this paper show that if a large neural network is used for a pattern classification problem and the learning algorithm finds a network with small weights that has small squared error on the training patterns, then the generalization performance depends on the size of the weights rather than the number of weights. For example, consider a two-layer feedforward network of sigmoid units, in which the sum of the magnitudes of the weights associated with each unit is bounded by A and the input dimension is n. We show that the misclassification probability is no more than a certain error estimate (that is related to squared error on the training set) plus A3 √((log n)/m) (ignoring log A and log m factors), where m is the number of training patterns. This may explain the generalization performance of neural networks, particularly when the number of training examples is considerably smaller than the number of weights. It also supports heuristics (such as weight decay and early stopping) that attempt to keep the weights small during training. The proof techniques appear to be useful for the analysis of other pattern classifiers: when the input domain is a totally bounded metric space, we use the same approach to give upper bounds on misclassification probability for classifiers with decision boundaries that are far from the training examples.
Resumo:
Many of the classification algorithms developed in the machine learning literature, including the support vector machine and boosting, can be viewed as minimum contrast methods that minimize a convex surrogate of the 0–1 loss function. The convexity makes these algorithms computationally efficient. The use of a surrogate, however, has statistical consequences that must be balanced against the computational virtues of convexity. To study these issues, we provide a general quantitative relationship between the risk as assessed using the 0–1 loss and the risk as assessed using any nonnegative surrogate loss function. We show that this relationship gives nontrivial upper bounds on excess risk under the weakest possible condition on the loss function—that it satisfies a pointwise form of Fisher consistency for classification. The relationship is based on a simple variational transformation of the loss function that is easy to compute in many applications. We also present a refined version of this result in the case of low noise, and show that in this case, strictly convex loss functions lead to faster rates of convergence of the risk than would be implied by standard uniform convergence arguments. Finally, we present applications of our results to the estimation of convergence rates in function classes that are scaled convex hulls of a finite-dimensional base class, with a variety of commonly used loss functions.
Resumo:
Binary classification methods can be generalized in many ways to handle multiple classes. It turns out that not all generalizations preserve the nice property of Bayes consistency. We provide a necessary and sufficient condition for consistency which applies to a large class of multiclass classification methods. The approach is illustrated by applying it to some multiclass methods proposed in the literature.
Resumo:
We consider the problem of binary classification where the classifier can, for a particular cost, choose not to classify an observation. Just as in the conventional classification problem, minimization of the sample average of the cost is a difficult optimization problem. As an alternative, we propose the optimization of a certain convex loss function φ, analogous to the hinge loss used in support vector machines (SVMs). Its convexity ensures that the sample average of this surrogate loss can be efficiently minimized. We study its statistical properties. We show that minimizing the expected surrogate loss—the φ-risk—also minimizes the risk. We also study the rate at which the φ-risk approaches its minimum value. We show that fast rates are possible when the conditional probability P(Y=1|X) is unlikely to be close to certain critical values.
Resumo:
Binary classification is a well studied special case of the classification problem. Statistical properties of binary classifiers, such as consistency, have been investigated in a variety of settings. Binary classification methods can be generalized in many ways to handle multiple classes. It turns out that one can lose consistency in generalizing a binary classification method to deal with multiple classes. We study a rich family of multiclass methods and provide a necessary and sufficient condition for their consistency. We illustrate our approach by applying it to some multiclass methods proposed in the literature.
Resumo:
The purpose of this conceptual paper is to address the lack of consistent means through which strategies are identified and discussed across theoretical perspectives in the field of business strategy. A standardised referencing system is offered to codify the means by which strategies can be identified, from which new business services and information systems may be derived. This taxonomy was developed using qualitative content analysis study of government agencies’ strategic plans. This taxonomy is useful for identifying strategy formation and determining gaps and opportunities. Managers will benefit from a more transparent strategic design process that reduces ambiguity, aids in identifying and correcting gaps in strategy formulation, and fosters enhanced strategic analysis. Key benefits to academics are the improved dialogue in strategic management field and suggest that progress in the field requires that fundamentals of strategy formulation and classification be considered more carefully. Finally, the formalization of strategy can lead to the clear identification of new business services, which inform ICT investment decisions and shared service prioritisation.
Resumo:
We consider the problem of structured classification, where the task is to predict a label y from an input x, and y has meaningful internal structure. Our framework includes supervised training of Markov random fields and weighted context-free grammars as special cases. We describe an algorithm that solves the large-margin optimization problem defined in [12], using an exponential-family (Gibbs distribution) representation of structured objects. The algorithm is efficient—even in cases where the number of labels y is exponential in size—provided that certain expectations under Gibbs distributions can be calculated efficiently. The method for structured labels relies on a more general result, specifically the application of exponentiated gradient updates [7, 8] to quadratic programs.
Resumo:
Background: Strategies for cancer reduction and management are targeted at both individual and area levels. Area-level strategies require careful understanding of geographic differences in cancer incidence, in particular the association with factors such as socioeconomic status, ethnicity and accessibility. This study aimed to identify the complex interplay of area-level factors associated with high area-specific incidence of Australian priority cancers using a classification and regression tree (CART) approach. Methods: Area-specific smoothed standardised incidence ratios were estimated for priority-area cancers across 478 statistical local areas in Queensland, Australia (1998-2007, n=186,075). For those cancers with significant spatial variation, CART models were used to identify whether area-level accessibility, socioeconomic status and ethnicity were associated with high area-specific incidence. Results: The accessibility of a person’s residence had the most consistent association with the risk of cancer diagnosis across the specific cancers. Many cancers were likely to have high incidence in more urban areas, although male lung cancer and cervical cancer tended to have high incidence in more remote areas. The impact of socioeconomic status and ethnicity on these associations differed by type of cancer. Conclusions: These results highlight the complex interactions between accessibility, socioeconomic status and ethnicity in determining cancer incidence risk.
Resumo:
Follicle classification is an important aid to the understanding of follicular development and atresia. Some bovine primordial follicles have the classical primordial shape, but ellipsoidal shaped follicles with some cuboidal granulosa cells at the poles are far more common. Preantral follicles have one of two basal lamina phenotypes, either a single aligned layer or one with additional layers. In antral follicles <5 mm diameter, half of the healthy follicles have columnar shaped basal granulosa cells and additional layers of basal lamina, which appear as loops in cross section (‘loopy’). The remainder have aligned single-layered follicular basal laminas with rounded basal cells, and contain better quality oocytes than the loopy/columnar follicles. In sizes >5 mm, only aligned/rounded phenotypes are present. Dominant and subordinate follicles can be identified by ultrasound and/or histological examination of pairs of ovaries. Atretic follicles <5 mm are either basal atretic or antral atretic, named on the basis of the location in the membrana granulosa where cells die first. Basal atretic follicles have considerable biological differences to antral atretic follicles. In follicles >5 mm, only antral atresia is observed. The concentrations of follicular fluid steroid hormones can be used to classify atresia and distinguish some of the different types of atresia; however, this method is unlikely to identify follicles early in atresia, and hence misclassify them as healthy. Other biochemical and histological methods can be used, but since cell death is a part of normal homoeostatis, deciding when a follicle has entered atresia remains somewhat subjective.