Digital Library

842 results for EUNIS classification

A Survey on Graphical Methods for Classification Predictive Performance Evaluation

Abstract:

Predictive performance evaluation is a fundamental issue in design, development, and deployment of classification systems. As predictive performance evaluation is a multidimensional problem, single scalar summaries such as error rate, although quite convenient due to their simplicity, can seldom evaluate all the aspects that a complete and reliable evaluation must consider. Due to this, various graphical performance evaluation methods are increasingly drawing the attention of machine learning, data mining, and pattern recognition communities. The main advantage of these types of methods resides in their ability to depict the trade-offs between evaluation aspects in a multidimensional space rather than reducing these aspects to an arbitrarily chosen (and often biased) single scalar measure. Furthermore, to appropriately select a suitable graphical method for a given task, it is crucial to identify its strengths and weaknesses. This paper surveys various graphical methods often used for predictive performance evaluation. By presenting these methods in the same framework, we hope this paper may shed some light on deciding which methods are more suitable to use in different situations.
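
To make the contrast between a scalar summary and a graphical method concrete, here is a minimal sketch of one widely used graphical method, the ROC curve. The abstract does not say which methods the survey covers, so the choice of ROC and the helper names `roc_points` and `auc` are assumptions made for illustration only.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points obtained by
    sweeping a decision threshold from the highest score downwards.
    labels are 1 (positive) or 0 (negative); both classes must be present."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Items with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule (itself a scalar summary)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

The list of points keeps the whole trade-off between true and false positives, whereas `auc` collapses it back into a single number, which is the kind of reduction the abstract cautions against.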

Bayesian network classifiers: Beyond classification accuracy

Abstract:

This work proposes and discusses an approach for inducing Bayesian classifiers aimed at balancing the tradeoff between the precise probability estimates produced by time-consuming unrestricted Bayesian networks and the computational efficiency of Naive Bayes (NB) classifiers. The proposed approach is based on the fundamental principles of the Heuristic Search Bayesian network learning. The Markov Blanket concept and a proposed "approximate Markov Blanket" are used to reduce the number of nodes that form the Bayesian network to be induced from data. Consequently, the usually high computational cost of the heuristic search learning algorithms can be lessened, while Bayesian network structures better than NB can be achieved. The resulting algorithms, called DMBC (Dynamic Markov Blanket Classifier) and A-DMBC (Approximate DMBC), are empirically assessed in twelve domains that illustrate scenarios of particular interest. The obtained results are compared with NB and Tree Augmented Network (TAN) classifiers, and confirm that both proposed algorithms can provide good classification accuracies and better probability estimates than NB and TAN, while being more computationally efficient than the widely used K2 Algorithm.

On the influence of imputation in classification: practical issues

Abstract:

The substitution of missing values, also called imputation, is an important data preparation task for many domains. Ideally, the substitution of missing values should not insert biases into the dataset. This aspect has been usually assessed by some measures of the prediction capability of imputation methods. Such measures assume the simulation of missing entries for some attributes whose values are actually known. These artificially missing values are imputed and then compared with the original values. Although this evaluation is useful, it does not allow the influence of imputed values in the ultimate modelling task (e.g. in classification) to be inferred. We argue that imputation cannot be properly evaluated apart from the modelling task. Thus, alternative approaches are needed. This article elaborates on the influence of imputed values in classification. In particular, a practical procedure for estimating the inserted bias is described. As an additional contribution, we have used such a procedure to empirically illustrate the performance of three imputation methods (majority, naive Bayes and Bayesian networks) in three datasets. Three classifiers (decision tree, naive Bayes and nearest neighbours) have been used as modelling tools in our experiments. The achieved results illustrate a variety of situations that can take place in the data preparation practice.
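
The conventional evaluation described in this abstract (hide known values, impute them, compare with the originals) can be sketched as follows. This is only an illustration of that baseline check using majority imputation; it is not the authors' bias-estimation procedure, and the function names, the 30% missing rate, and the fixed seed are assumptions.

```python
import random
from collections import Counter


def majority_impute(column):
    """Replace None entries with the most frequent known value."""
    known = [v for v in column if v is not None]
    mode = Counter(known).most_common(1)[0][0]
    return [mode if v is None else v for v in column]


def imputation_accuracy(column, missing_rate=0.3, seed=0):
    """Hide a fraction of known values, impute them, and report the
    fraction of hidden entries that were recovered exactly."""
    rng = random.Random(seed)
    n_hide = max(1, int(len(column) * missing_rate))
    hidden = rng.sample(range(len(column)), n_hide)
    masked = list(column)
    for i in hidden:
        masked[i] = None  # artificially missing value
    imputed = majority_impute(masked)
    hits = sum(imputed[i] == column[i] for i in hidden)
    return hits / len(hidden)
```

As the abstract argues, a high score here says nothing about how the imputed values affect a downstream classifier, so this number would need to be complemented by an evaluation on the modelling task itself.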

Poly-bagging predictors for classification modelling for credit scoring

Abstract:

Credit scoring modelling comprises one of the leading formal tools for supporting the granting of credit. Its core objective consists of the generation of a score by means of which potential clients can be listed in the order of the probability of default. A critical factor is whether a credit scoring model is accurate enough to provide correct classification of the client as a good or bad payer. In this context the concept of bootstrap aggregating (bagging) arises. The basic idea is to generate multiple classifiers by obtaining the predicted values from the fitted models to several replicated datasets and then combining them into a single predictive classification in order to improve the classification accuracy. In this paper we propose a new bagging-type variant procedure, which we call poly-bagging, consisting of combining predictors over a succession of resamplings. The study is motivated by credit scoring modelling. The proposed poly-bagging procedure was applied to some different artificial datasets and to a real granting of credit dataset up to three successions of resamplings. We observed better classification accuracy for the two-bagged and the three-bagged models for all considered setups. These results lead to a strong indication that the poly-bagging approach may promote improvement on the modelling performance measures, while keeping a flexible and straightforward bagging-type structure easy to implement. (C) 2011 Elsevier Ltd. All rights reserved.
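
The basic bagging idea described in this abstract (fit a model to each bootstrap replicate, then combine the predictions) can be sketched as below. This shows plain bagging only; the abstract does not give enough detail to reproduce poly-bagging. The nearest-centroid base learner, the one-dimensional feature, and all names are assumptions chosen to keep the example small.

```python
import random
from collections import Counter


def fit_centroid(sample):
    """Toy base learner: predict the class whose mean feature value is nearest.
    sample is a list of (x, y) pairs with a numeric x and a class label y."""
    sums = {}
    counts = Counter()
    for x, y in sample:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] += 1
    centroids = {y: sums[y] / counts[y] for y in counts}
    return lambda x: min(centroids, key=lambda y: abs(x - centroids[y]))


def bagging(data, n_models=25, seed=0):
    """Fit one base learner per bootstrap replicate and combine by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap replicate: sample with replacement, same size as the data.
        replicate = [rng.choice(data) for _ in data]
        models.append(fit_centroid(replicate))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict
```

A usage example with hypothetical data: `predict = bagging([(0.0, 0), (0.2, 0), (0.1, 0), (1.0, 1), (0.9, 1), (1.1, 1)])`, after which `predict(0.05)` votes among the 25 fitted models.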

Classification of quantum relativistic orientable objects

Abstract:

Extending our previous work 'Fields on the Poincaré group and quantum description of orientable objects' (Gitman and Shelepin 2009 Eur. Phys. J. C 61 111-39), we consider here a classification of orientable relativistic quantum objects in 3 + 1 dimensions. In such a classification, one uses a maximal set of ten commuting operators (generators of left and right transformations) in the space of functions on the Poincaré group. In addition to the usual six quantum numbers related to external symmetries (given by left generators), there appear additional quantum numbers related to internal symmetries (given by right generators). Spectra of internal and external symmetry operators are interrelated, which, however, does not contradict the Coleman-Mandula no-go theorem. We believe that the proposed approach can be useful for the description of elementary spinning particles considered as orientable objects. In particular, it gives a group-theoretical interpretation of some facts of the existing phenomenological classification of spinning particles.

Texture analysis and classification using deterministic tourist walk

Abstract:

In this paper, we present a study on a deterministic partially self-avoiding walk (tourist walk), which provides a novel method for texture feature extraction. The method is able to explore an image on all scales simultaneously. Experiments were conducted using different dynamics concerning the tourist walk. A new strategy, based on histograms, to extract information from its joint probability distribution is presented. The promising results are discussed and compared to the best-known methods for texture description reported in the literature. (C) 2009 Elsevier Ltd. All rights reserved.

Shape classification using complex network and Multi-scale Fractal Dimension

Abstract:

Shape provides some of the most relevant information about an object. This makes shape one of the most important visual attributes used to characterize objects. This paper introduces a novel approach for shape characterization, which combines modeling shape as a complex network with the analysis of its complexity in a dynamic evolution context. Descriptors computed through this approach prove efficient in shape characterization, incorporating many characteristics, such as scale and rotation invariance. Experiments using two different shape databases (an artificial shapes database and a leaf shape database) are presented in order to evaluate the method, and its results are compared to traditional shape analysis methods found in the literature. (C) 2009 Published by Elsevier B.V.

Concentric characterization and classification of complex network nodes: Application to an institutional collaboration network

Abstract:

Differently from theoretical scale-free networks, most real networks present multi-scale behavior, with nodes structured in different types of functional groups and communities. While the majority of approaches for classification of nodes in a complex network have relied on local measurements of the topology/connectivity around each node, valuable information about node functionality can be obtained by concentric (or hierarchical) measurements. This paper extends previous methodologies based on concentric measurements by studying the possibility of using agglomerative clustering methods to obtain a set of functional groups of nodes, considering particular institutional collaboration network nodes, including various known communities (departments of the University of São Paulo). Among the findings, we emphasize the scale-free nature of the network obtained, as well as the identification of different patterns of authorship emerging from different areas (e.g. human and exact sciences). Another interesting result concerns the relatively uniform distribution of hubs along concentric levels, in contrast to the non-uniform pattern found in theoretical scale-free networks such as the BA model. (C) 2008 Elsevier B.V. All rights reserved.

The CATH classification revisited-architectures reviewed and new ways to characterize structural divergence in superfamilies

Abstract:

The latest version of CATH (class, architecture, topology, homology) (version 3.2), released in July 2008 (http://www.cathdb.info), contains 114 215 domains, 2178 homologous superfamilies and 1110 fold groups. We have assigned 20 330 new domains, 87 new homologous superfamilies and 26 new folds since CATH release version 3.1. A total of 28 064 new domains have been assigned since our NAR 2007 database publication (CATH version 3.0). The CATH website has been completely redesigned and includes more comprehensive documentation. We have revisited the CATH architecture level as part of the development of a 'Protein Chart' and present information on the population of each architecture. The CATHEDRAL structure comparison algorithm has been improved and used to characterize structural diversity in CATH superfamilies and structural overlaps between superfamilies. Although the majority of superfamilies in CATH are not structurally diverse and do not overlap significantly with other superfamilies, approximately 4% of superfamilies are very diverse and these are the superfamilies that are most highly populated in both the PDB and in the genomes. Information on the degree of structural diversity in each superfamily and structural overlaps between superfamilies can now be downloaded from the CATH website.

The three-dimensional structure of bothropasin, the main hemorrhagic factor from Bothrops jararaca venom: Insights for a new classification of snake venom metalloprotease subgroups

Abstract:

Bothropasin is a 48 kDa hemorrhagic PIII snake venom metalloprotease (SVMP) isolated from Bothrops jararaca, containing disintegrin/cysteine-rich adhesive domains. Here we present the crystal structure of bothropasin complexed with the inhibitor POL647. The catalytic domain consists of a scaffold of two subdomains organized similarly to those described for other SVMPs, including the zinc and calcium-binding sites. The free cysteine residue Cys(189) is located within a hydrophobic core and it is not available for disulfide bonding or other interactions. There is no identifiable secondary structure for the disintegrin domain, but instead it is composed mostly of loops stabilized by seven disulfide bonds and by two calcium ions. The ECD region is in a loop and is structurally related to the RGD region of RGD disintegrins, which are derived from PII SVMPs. The ECD motif is stabilized by the Cys(117)-Cys(310) disulfide bond (between the disintegrin and cysteine-rich domains) and by one calcium ion. The side chain of Glu(276) of the ECD motif is exposed to solvent and free to make interactions. In bothropasin, the HVR (hyper-variable region) described for other PIII SVMPs in the cysteine-rich domain presents a well-conserved sequence with respect to several other PIII members from different species. We propose that this subset be referred to as PIII-HCR (highly conserved region) SVMPs. The differences in the disintegrin-like, cysteine-rich or disintegrin-like cysteine-rich domains may be involved in selecting target binding, which in turn could generate substrate diversity or specificity for the catalytic domain. (C) 2008 Elsevier Ltd. All rights reserved.

A novel MAP-MRF approach for multispectral image contextual classification using combination of suboptimal iterative algorithms

Abstract:

In this paper we present a novel approach for multispectral image contextual classification by combining iterative combinatorial optimization algorithms. The pixel-wise decision rule is defined using a Bayesian approach to combine two MRF models: a Gaussian Markov Random Field (GMRF) for the observations (likelihood) and a Potts model for the a priori knowledge, to regularize the solution in the presence of noisy data. Hence, the classification problem is stated according to a Maximum a Posteriori (MAP) framework. In order to approximate the MAP solution we apply several combinatorial optimization methods using multiple simultaneous initializations, making the solution less sensitive to the initial conditions and reducing both computational cost and time in comparison to Simulated Annealing, which is often infeasible in real image processing applications. Markov Random Field model parameters are estimated by the Maximum Pseudo-Likelihood (MPL) approach, avoiding manual adjustments in the choice of the regularization parameters. Asymptotic evaluations assess the accuracy of the proposed parameter estimation procedure. To test and evaluate the proposed classification method, we adopt metrics for quantitative performance assessment (Cohen's Kappa coefficient), allowing a robust and accurate statistical analysis. The obtained results clearly show that combining sub-optimal contextual algorithms significantly improves the classification performance, indicating the effectiveness of the proposed methodology. (C) 2010 Elsevier B.V. All rights reserved.
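
For readers unfamiliar with the metric named in this abstract, Cohen's Kappa compares the observed agreement between predicted and reference labels with the agreement expected by chance. The sketch below uses the standard definition computed from a confusion matrix; the function name and the example matrix are illustrative assumptions and do not come from the paper.

```python
def cohens_kappa(confusion):
    """Cohen's Kappa from a square confusion matrix given as a list of rows,
    where confusion[i][j] counts items of reference class i predicted as j."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of items on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total
    # Chance agreement: product of row and column marginals per class.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(k)
    ) / total ** 2
    return (observed - expected) / (1 - expected)
```

For a hypothetical matrix `[[20, 5], [10, 15]]` the observed agreement is 0.7 and the chance agreement is 0.5, so Kappa is approximately 0.4, noticeably lower than the raw accuracy.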

The classification and the conjugacy classes of the finite subgroups of the sphere braid groups

Abstract:

Let n >= 3. We classify the finite groups which are realised as subgroups of the sphere braid group B_n(S^2). Such groups must be of cohomological period 2 or 4. Depending on the value of n, we show that the following are the maximal finite subgroups of B_n(S^2): Z_{2(n-1)}; the dicyclic groups of order 4n and 4(n - 2); the binary tetrahedral group T*; the binary octahedral group O*; and the binary icosahedral group I*. We give geometric as well as some explicit algebraic constructions of these groups in B_n(S^2) and determine the number of conjugacy classes of such finite subgroups. We also reprove Murasugi's classification of the torsion elements of B_n(S^2) and explain how the finite subgroups of B_n(S^2) are related to this classification, as well as to the lower central and derived series of B_n(S^2).

Classification of subalgebras of the Cayley algebra over a finite field

Abstract:

We classify all unital subalgebras of the Cayley algebra O(q) over the finite field F_q, q = p^n. We obtain the number of subalgebras of each type and prove that all isomorphic subalgebras are conjugate with respect to the automorphism group of O(q). We also determine the structure of the Moufang loops associated with each subalgebra of O(q).

Classification of the virtually cyclic subgroups of the pure braid groups of the projective plane

Abstract:

We classify the (finite and infinite) virtually cyclic subgroups of the pure braid groups P_n(RP^2) of the projective plane. The maximal finite subgroups of P_n(RP^2) are isomorphic to the quaternion group of order 8 if n = 3, and to Z_4 if n >= 4. Further, for all n >= 3, the following groups are, up to isomorphism, the infinite virtually cyclic subgroups of P_n(RP^2): Z, Z_2 x Z and the amalgamated product Z_4 *_{Z_2} Z_4.

Enhanced classification of Chagas serologic results and epidemiologic characteristics of seropositive donors at three large blood centers in Brazil

Abstract:

BACKGROUND: A major problem in Chagas disease donor screening is the high frequency of samples with inconclusive results. The objective of this study was to describe patterns of serologic results among donors to the three Brazilian REDS-II blood centers and correlate them with epidemiologic characteristics. STUDY DESIGN AND METHODS: The centers screened donor samples with one Trypanosoma cruzi lysate enzyme immunoassay (EIA). EIA-reactive samples were tested with a second lysate EIA, a recombinant-antigen based EIA, and an immunofluorescence assay. Based on the serologic results, samples were classified as confirmed positive (CP), probable positive (PP), possible other parasitic infection (POPI), and false positive (FP). RESULTS: In 2007 to 2008, a total of 877 of 615,433 donations were discarded due to Chagas assay reactivity. The prevalences (95% confidence intervals [CIs]) among first-time donors for CP, PP, POPI, and FP patterns were 114 (99-129), 26 (19-34), 10 (5-14), and 96 (82-110) per 100,000 donations, respectively. CP and PP had similar patterns of prevalence when analyzed by age, sex, education, and location, suggesting that PP cases represent true T. cruzi infections; in contrast, the demographics of donors with POPI were distinct and likely unrelated to Chagas disease. No CP cases were detected among 218,514 repeat donors followed for a total of 718,187 person-years. CONCLUSION: We have proposed a classification algorithm that may have practical importance for donor counseling and epidemiologic analyses of T. cruzi-seroreactive donors. The absence of incident T. cruzi infections is reassuring with respect to risk of window phase infections within Brazil and travel-related infections in nonendemic countries such as the United States.
