55 results for User classification
Abstract:
Optical diagnostic methods such as near-infrared Raman spectroscopy allow the quantification and evaluation of human diseases and could be useful in identifying and diagnosing atherosclerosis in coronary arteries. The goal of the present work is to apply Independent Component Analysis (ICA) for data reduction and feature extraction of Raman spectra and to use the Mahalanobis distance for group classification according to histopathology, obtaining feasible diagnostic information to detect atheromatous plaque. An 830 nm Ti:sapphire laser pumped by an argon laser provides near-infrared excitation. A spectrograph disperses light scattered from arterial tissues over a liquid-nitrogen-cooled CCD to detect the Raman spectra. A total of 111 spectra from arterial fragments were used.
Abstract:
This study presents the results of Raman spectroscopy applied to the classification of arterial tissue based on a simplified model using basal morphological and biochemical information extracted from the Raman spectra of arteries. The Raman spectrograph uses an 830-nm diode laser, an imaging spectrograph, and a CCD camera. A total of 111 Raman spectra from arterial fragments were used to develop the model, and those spectra were compared to the spectra of collagen, fat cells, smooth muscle cells, calcification, and cholesterol in a linear fit model. Non-atherosclerotic (NA) arteries, fatty and fibrous-fatty atherosclerotic plaques (A), and calcified (C) arteries exhibited different spectral signatures related to the different morphological structures present in each tissue type. Discriminant analysis based on Mahalanobis distance was employed to classify the tissue type with respect to the relative intensity of each compound. This model was subsequently tested prospectively in a set of 55 spectra. The simplified diagnostic model showed that cholesterol, collagen, and adipocytes were the tissue constituents that gave the best classification capability and that those changes were correlated to histopathology. The simplified model, using spectra obtained from a few tissue morphological and biochemical constituents, showed feasibility by using a small number of variables, easily extracted from gross samples.
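The Mahalanobis-distance discriminant step mentioned above can be sketched as follows. This is a minimal illustration and not the authors' model: it assumes only two features per spectrum (hypothetical relative intensities of cholesterol and collagen), a separate covariance matrix per tissue class, and invented training values.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def covariance_2d(rows):
    """Sample covariance matrix (2x2) for rows with exactly two features."""
    mx, my = mean_vector(rows)
    n = len(rows)
    sxx = sum((x - mx) ** 2 for x, _ in rows) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in rows) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]

def mahalanobis_2d(point, mean, cov):
    """Mahalanobis distance of a 2-feature point from a class mean."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Quadratic form with the explicit 2x2 inverse (1/det) * [[d, -b], [-c, a]].
    q = (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det
    return math.sqrt(max(q, 0.0))

def classify(point, groups):
    """Assign the point to the class with the smallest Mahalanobis distance."""
    distances = {
        label: mahalanobis_2d(point, mean_vector(rows), covariance_2d(rows))
        for label, rows in groups.items()
    }
    return min(distances, key=distances.get)

# Hypothetical (cholesterol, collagen) relative intensities, invented for illustration.
TRAINING = {
    "NA": [(0.10, 0.80), (0.20, 0.90), (0.15, 0.70), (0.25, 0.85)],
    "A": [(0.70, 0.30), (0.80, 0.25), (0.60, 0.40), (0.75, 0.20)],
}

print(classify((0.2, 0.8), TRAINING))
```

Unlike Euclidean distance, this measure rescales each direction by the spread of the class, so a class with highly variable features tolerates larger deviations before a sample is rejected.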
Abstract:
Objective: The aim was to compare three ulcer classification systems as predictors of the outcome of diabetic foot ulcers: the Wagner, the University of Texas (UT), and the size (area, depth), sepsis, arteriopathy, denervation (S(AD)SAD) systems, in a specialist clinic in Brazil. Methods: Ulcer area, depth, appearance, infection and associated ischaemia and neuropathy were recorded in a consecutive series of 94 subjects. A novel score, the S(AD)SAD score, was derived from the sum of individual items of the S(AD)SAD system, and was evaluated. Follow-up was for at least 6 months. The primary outcome measure was the incidence of healing. Results: Mean age was 57.6 years; 57 (60.6%) were male. Forty-eight ulcers (51.1%) healed without surgery; 11 (12.2%) subjects underwent minor amputation. Significant differences in terms of healing were observed for depth (P = 0.002), infection (P = 0.006) and denervation (P = 0.002) using the S(AD)SAD system, for UT grade (P = 0.002) and stage (P = 0.032) and for Wagner grades (P = 0.002). Ulcers with an S(AD)SAD score of <= 9 (total possible 15) were 7.6 times more likely to heal than those with scores >= 10 (P < 0.001). Conclusions: All three systems predicted ulcer outcome. The S(AD)SAD score of ulcer severity could represent a useful addition to routine clinical practice. The association between outcome and ulcer depth confirms earlier reports. The association with infection was stronger than that reported from centres in Europe or North America. The very strong association with neuropathy has only previously been observed in Tanzania. Studies designed to compare the outcome in different countries should adopt systems of classification that are valid for the populations studied.
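The summed score described above can be illustrated with a short sketch. The abstract states only that the score is the sum of the individual items with a maximum of 15 and that the groups compared were <= 9 and >= 10; the assumption here, labeled as such, is that each of the five items (area, depth, sepsis, arteriopathy, denervation) is graded from 0 to 3. The function and field names are hypothetical.

```python
# Assumed item list; each item is assumed to be graded 0-3 (5 items x 3 = 15).
ITEMS = ("area", "depth", "sepsis", "arteriopathy", "denervation")

def sad_score(grades):
    """Sum the five item grades after checking that each lies in 0..3."""
    for item in ITEMS:
        grade = grades[item]
        if not 0 <= grade <= 3:
            raise ValueError(f"{item} grade must be between 0 and 3, got {grade}")
    return sum(grades[item] for item in ITEMS)

def score_group(score):
    """Return which of the two groups compared in the abstract a score falls in."""
    return "<= 9" if score <= 9 else ">= 10"

example = {"area": 2, "depth": 1, "sepsis": 1, "arteriopathy": 0, "denervation": 3}
print(sad_score(example), score_group(sad_score(example)))
```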
Abstract:
Microarray gene expression profiling is a high-throughput system used to identify differentially expressed genes and regulation patterns, and to discover new tumor markers. As the molecular pathogenesis of meningiomas and schwannomas, characterized by NF2 gene alterations, remains unclear and suitable molecular targets need to be identified, we used low density cDNA microarrays to establish expression patterns of 96 cancer-related genes on 23 schwannomas, 42 meningiomas and 3 normal cerebral meninges. We also performed a mutational analysis of the NF2 gene (PCR, dHPLC, Sequencing and MLPA), a search for 22q LOH and an analysis of gene silencing by promoter hypermethylation (MS-MLPA). Results showed a high frequency of NF2 gene mutations (40%), increased 22q LOH as aggressiveness increased, frequent losses and gains by MLPA in benign meningiomas, and gene expression silencing by hypermethylation. Array analysis showed decreased expression of 7 genes in meningiomas. Unsupervised analyses identified 2 molecular subgroups for both meningiomas and schwannomas showing 38 and 20 differentially expressed genes, respectively, and 19 genes differentially expressed between the two tumor types. These findings provide a molecular subgroup classification for meningiomas and schwannomas with possible implications for clinical practice.
Abstract:
Epidendrum L. is the largest genus of Orchidaceae in the Neotropical region; it has an impressive morphological diversification, which imposes difficulties in delimitation of both infrageneric and interspecific boundaries. In this study, we review infrageneric boundaries within the subgenus Amphiglottium and try to contribute to the understanding of morphological diversification and taxa delimitation within this group. We tested the monophyly of the subgenus Amphiglottium sect. Amphiglottium, expanding previous phylogenetic investigations and reevaluated previous infrageneric classifications proposed. Sequence data from the trnL-trnF region were analyzed with both parsimony and maximum likelihood criteria. AFLP markers were also obtained and analyzed with phylogenetic and principal coordinate analyses. Additionally, we obtained chromosome numbers for representative species within the group. The results strengthen the monophyly of the subgenus Amphiglottium but do not support the current classification system proposed by previous authors. Only section Tuberculata comprises a well-supported monophyletic group, with sections Carinata and Integra not supported. Instead of morphology, biogeographical and ecological patterns are reflected in the phylogenetic signal in this group. This study also confirms the large variability of chromosome numbers for the subgenus Amphiglottium (numbers ranging from 2n = 24 to 2n = 240), suggesting that polyploidy and hybridization are probably important mechanisms of speciation within the group.
Abstract:
Predictive performance evaluation is a fundamental issue in the design, development, and deployment of classification systems. As predictive performance evaluation is a multidimensional problem, single scalar summaries such as error rate, although quite convenient due to their simplicity, can seldom evaluate all the aspects that a complete and reliable evaluation must consider. For this reason, various graphical performance evaluation methods are increasingly drawing the attention of the machine learning, data mining, and pattern recognition communities. The main advantage of these methods resides in their ability to depict the trade-offs between evaluation aspects in a multidimensional space rather than reducing these aspects to an arbitrarily chosen (and often biased) single scalar measure. Furthermore, to select a suitable graphical method for a given task, it is crucial to identify its strengths and weaknesses. This paper surveys various graphical methods often used for predictive performance evaluation. By presenting these methods in the same framework, we hope this paper may shed some light on deciding which methods are more suitable to use in different situations.
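As one concrete instance of a graphical evaluation method, the sketch below computes the points of a ROC curve. The abstract does not list which methods the survey covers, so the choice of ROC is an illustrative assumption, and the scores and labels are invented.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points.

    The decision threshold is swept from the highest score to the lowest;
    samples with tied scores are processed together as one step.
    Labels are 1 for positive and 0 for negative.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        current = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == current:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

curve = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
print(curve, area_under_curve(curve))
```

The full list of points shows how the true positive rate trades off against the false positive rate at every threshold, while the area collapses that trade-off back into a single scalar.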
Abstract:
This work proposes and discusses an approach for inducing Bayesian classifiers aimed at balancing the tradeoff between the precise probability estimates produced by time-consuming unrestricted Bayesian networks and the computational efficiency of Naive Bayes (NB) classifiers. The proposed approach is based on the fundamental principles of Heuristic Search Bayesian network learning. The Markov Blanket concept, as well as a proposed "approximate Markov Blanket", is used to reduce the number of nodes that form the Bayesian network to be induced from data. Consequently, the usually high computational cost of the heuristic search learning algorithms can be lessened, while Bayesian network structures better than NB can be achieved. The resulting algorithms, called DMBC (Dynamic Markov Blanket Classifier) and A-DMBC (Approximate DMBC), are empirically assessed in twelve domains that illustrate scenarios of particular interest. The obtained results are compared with NB and Tree Augmented Network (TAN) classifiers, and confirm that both proposed algorithms can provide good classification accuracies and better probability estimates than NB and TAN, while being more computationally efficient than the widely used K2 Algorithm.
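For readers unfamiliar with the Markov Blanket concept used above, the sketch below extracts the standard Markov blanket of a node (its parents, its children, and the other parents of its children) from a known network structure. It is not the DMBC or A-DMBC algorithm, which the abstract does not specify; the graph and node names are hypothetical.

```python
def markov_blanket(parents, node):
    """Return the Markov blanket of `node` in a directed acyclic graph.

    `parents` maps each node to the set of its parents. The blanket is the
    union of the node's parents, its children, and its children's other
    parents.
    """
    blanket = set(parents.get(node, ()))
    for child, child_parents in parents.items():
        if node in child_parents:
            blanket.add(child)
            blanket.update(child_parents)
    blanket.discard(node)
    return blanket

# Hypothetical structure: A -> C, B -> C, C -> D, E -> D, D -> F.
STRUCTURE = {"C": {"A", "B"}, "D": {"C", "E"}, "F": {"D"}}

print(sorted(markov_blanket(STRUCTURE, "C")))
```

Given its blanket, a node is conditionally independent of every other node in the network, which is why restricting the network to the blanket of the class variable can shrink the structure to be learned.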
Abstract:
Multidimensional visualization techniques are invaluable tools for the analysis of structured and unstructured data with variable dimensionality. This paper introduces PEx-Image (Projection Explorer for Images), a tool aimed at supporting the analysis of image collections. The tool supports a methodology that employs interactive visualizations to aid user-driven feature detection and classification tasks, thus offering improved analysis and exploration capabilities. The visual mappings employ similarity-based multidimensional projections and point placement to lay out the data on a plane for visual exploration. In addition to its application to image databases, we also illustrate how the proposed approach can be successfully employed in the simultaneous analysis of different data types, such as text and images, offering a common visual representation for data expressed in different modalities.
Abstract:
The substitution of missing values, also called imputation, is an important data preparation task for many domains. Ideally, the substitution of missing values should not insert biases into the dataset. This aspect has usually been assessed by measures of the prediction capability of imputation methods. Such measures rely on simulating missing entries for some attributes whose values are actually known. These artificially missing values are imputed and then compared with the original values. Although this evaluation is useful, it does not allow the influence of imputed values on the ultimate modelling task (e.g. classification) to be inferred. We argue that imputation cannot be properly evaluated apart from the modelling task. Thus, alternative approaches are needed. This article elaborates on the influence of imputed values in classification. In particular, a practical procedure for estimating the inserted bias is described. As an additional contribution, we have used such a procedure to empirically illustrate the performance of three imputation methods (majority, naive Bayes and Bayesian networks) in three datasets. Three classifiers (decision tree, naive Bayes and nearest neighbours) have been used as modelling tools in our experiments. The achieved results illustrate a variety of situations that can take place in data preparation practice.
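The conventional evaluation described above (hide known values, impute them, compare with the originals) can be sketched as follows, using majority imputation on a single categorical attribute. This shows only the prediction-capability measure that the abstract argues is insufficient on its own, not the bias-estimation procedure the article proposes; the missing rate, seed, and data are assumptions.

```python
import random
from collections import Counter

def majority_impute(column):
    """Replace None entries with the most frequent known value."""
    known = [value for value in column if value is not None]
    mode = Counter(known).most_common(1)[0][0]
    return [mode if value is None else value for value in column]

def imputation_accuracy(column, missing_rate, seed=0):
    """Hide a fraction of known values, impute them, and score the matches."""
    rng = random.Random(seed)
    n_hidden = max(1, int(len(column) * missing_rate))
    hidden = rng.sample(range(len(column)), n_hidden)
    masked = list(column)
    for index in hidden:
        masked[index] = None  # simulate a missing entry
    imputed = majority_impute(masked)
    hits = sum(imputed[index] == column[index] for index in hidden)
    return hits / n_hidden

colours = ["red", "red", "blue", "red", "green", "red", "blue", "red"]
print(imputation_accuracy(colours, missing_rate=0.25))
```

A high score here says that the imputed values resemble the hidden ones, but nothing about how a classifier trained on the imputed dataset behaves, which is the gap the abstract points to.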
Abstract:
We introduce a flexible technique for interactive exploration of vector field data through classification derived from user-specified feature templates. Our method is founded on the observation that, while similar features within the vector field may be spatially disparate, they share similar neighborhood characteristics. Users generate feature-based visualizations by interactively highlighting well-accepted and domain specific representative feature points. Feature exploration begins with the computation of attributes that describe the neighborhood of each sample within the input vector field. Compilation of these attributes forms a representation of the vector field samples in the attribute space. We project the attribute points onto the canonical 2D plane to enable interactive exploration of the vector field using a painting interface. The projection encodes the similarities between vector field points within the distances computed between their associated attribute points. The proposed method is performed at interactive rates for enhanced user experience and is completely flexible as showcased by the simultaneous identification of diverse feature types.
Abstract:
Credit scoring modelling is one of the leading formal tools for supporting the granting of credit. Its core objective is to generate a score by means of which potential clients can be ranked by their probability of default. A critical factor is whether a credit scoring model is accurate enough to classify a client correctly as a good or bad payer. In this context the concept of bootstrap aggregating (bagging) arises. The basic idea is to generate multiple classifiers by obtaining the predicted values from models fitted to several replicated datasets and then combining them into a single predictive classification in order to improve the classification accuracy. In this paper we propose a new bagging-type variant procedure, which we call poly-bagging, consisting of combining predictors over a succession of resamplings. The study is motivated by credit scoring modelling. The proposed poly-bagging procedure was applied to several artificial datasets and to a real credit-granting dataset, with up to three successions of resamplings. We observed better classification accuracy for the two-bagged and the three-bagged models in all considered setups. These results strongly indicate that the poly-bagging approach may improve modelling performance measures while keeping a flexible and straightforward bagging-type structure that is easy to implement. (C) 2011 Elsevier Ltd. All rights reserved.
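The basic bagging idea described above (fit a model to each bootstrap replicate, then combine the predictions) can be sketched as below. This is ordinary single-level bagging with a majority vote, not the poly-bagging procedure, whose details the abstract does not give. The base learner, a one-feature threshold rule, and the data are assumptions chosen for brevity.

```python
import random

def fit_threshold_rule(sample):
    """Fit a one-feature rule: threshold at the midpoint of the class means.

    `sample` is a list of (x, y) pairs with y in {0, 1}.
    """
    xs0 = [x for x, y in sample if y == 0]
    xs1 = [x for x, y in sample if y == 1]
    if not xs0 or not xs1:
        # Degenerate resample containing one class: always predict that class.
        only = 1 if xs1 else 0
        return lambda x: only
    mean0 = sum(xs0) / len(xs0)
    mean1 = sum(xs1) / len(xs1)
    threshold = (mean0 + mean1) / 2
    upper = 1 if mean1 > mean0 else 0  # class predicted above the threshold
    return lambda x: upper if x > threshold else 1 - upper

def bagging(data, n_models=25, seed=0):
    """Fit one rule per bootstrap resample and combine them by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(data) for _ in data]  # sample with replacement
        models.append(fit_threshold_rule(bootstrap))

    def predict(x):
        votes = sum(model(x) for model in models)
        return 1 if 2 * votes > len(models) else 0

    return predict

# Hypothetical data: x is a risk indicator, y = 1 marks a bad payer.
clients = [(1, 0), (2, 0), (3, 0), (4, 0), (7, 1), (8, 1), (9, 1), (10, 1)]
predict = bagging(clients)
print(predict(2.5), predict(8.5))
```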
Abstract:
Extending our previous work "Fields on the Poincaré group and quantum description of orientable objects" (Gitman and Shelepin 2009 Eur. Phys. J. C 61 111-39), we consider here a classification of orientable relativistic quantum objects in 3 + 1 dimensions. In such a classification, one uses a maximal set of ten commuting operators (generators of left and right transformations) in the space of functions on the Poincaré group. In addition to the usual six quantum numbers related to external symmetries (given by left generators), there appear additional quantum numbers related to internal symmetries (given by right generators). Spectra of internal and external symmetry operators are interrelated, which, however, does not contradict the Coleman-Mandula no-go theorem. We believe that the proposed approach can be useful for the description of elementary spinning particles considered as orientable objects. In particular, it gives a group-theoretical interpretation of some facts of the existing phenomenological classification of spinning particles.
Abstract:
In this paper, we present a study on a deterministic partially self-avoiding walk (tourist walk), which provides a novel method for texture feature extraction. The method is able to explore an image on all scales simultaneously. Experiments were conducted using different dynamics concerning the tourist walk. A new strategy, based on histograms, to extract information from its joint probability distribution is presented. The promising results are discussed and compared to the best-known methods for texture description reported in the literature. (C) 2009 Elsevier Ltd. All rights reserved.
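A deterministic partially self-avoiding walk on an image can be sketched as follows. The abstract does not state the movement rule, so everything here is an assumption: the walker moves to the 8-connected neighbour with the smallest intensity difference, it may not step onto any of the last `memory` pixels of its path (the current one included), and ties are broken by row and column order. The histogram-based feature extraction is not reproduced.

```python
def tourist_walk(image, start, memory, max_steps=50):
    """Return the path of a deterministic partially self-avoiding walk.

    `image` is a list of rows of grey levels, `start` is a (row, col) pair,
    and `memory` (>= 1) is the number of most recent path pixels that are
    forbidden as the next step.
    """
    if memory < 1:
        raise ValueError("memory must be at least 1")
    rows, cols = len(image), len(image[0])
    path = [start]
    for _ in range(max_steps):
        r, c = path[-1]
        recent = set(path[-memory:])
        candidates = []
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if (dr or dc) and inside and (nr, nc) not in recent:
                    difference = abs(image[nr][nc] - image[r][c])
                    candidates.append((difference, nr, nc))
        if not candidates:
            break  # every neighbour is still in the memory window
        _, nr, nc = min(candidates)  # smallest difference, then row, then col
        path.append((nr, nc))
    return path

# Hypothetical 4x4 grey-level image.
IMAGE = [
    [10, 12, 50, 52],
    [11, 13, 51, 53],
    [90, 92, 30, 31],
    [91, 93, 32, 33],
]

print(tourist_walk(IMAGE, start=(0, 0), memory=3, max_steps=12))
```

With a short memory the walker quickly falls into a cycle within a region of similar intensity, while a longer memory forces it to cross into neighbouring regions, which is one way such a walk can probe several scales.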
Abstract:
Shape provides some of the most relevant information about an object. This makes shape one of the most important visual attributes used to characterize objects. This paper introduces a novel approach for shape characterization, which combines modeling a shape as a complex network with the analysis of its complexity in a dynamic evolution context. Descriptors computed through this approach prove to be efficient in shape characterization, incorporating many characteristics, such as scale and rotation invariance. Experiments using two different shape databases (an artificial shapes database and a leaf shape database) are presented in order to evaluate the method, and its results are compared to traditional shape analysis methods found in the literature. (C) 2009 Published by Elsevier B.V.
Abstract:
Unlike theoretical scale-free networks, most real networks present multi-scale behavior, with nodes structured in different types of functional groups and communities. While the majority of approaches for the classification of nodes in a complex network have relied on local measurements of the topology/connectivity around each node, valuable information about node functionality can be obtained by concentric (or hierarchical) measurements. This paper extends previous methodologies based on concentric measurements by studying the possibility of using agglomerative clustering methods to obtain a set of functional groups of nodes, considering the nodes of a particular institutional collaboration network that includes various known communities (departments of the University of São Paulo). Among the interesting findings, we emphasize the scale-free nature of the network obtained, as well as the identification of different patterns of authorship emerging from different areas (e.g. human and exact sciences). Another interesting result concerns the relatively uniform distribution of hubs along concentric levels, in contrast to the non-uniform pattern found in theoretical scale-free networks such as the BA model. (C) 2008 Elsevier B.V. All rights reserved.
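The concentric levels on which such measurements are based can be sketched with a breadth-first search that groups nodes by their shortest-path distance from a reference node. This shows only how the levels are formed; the specific concentric measurements and the agglomerative clustering used in the paper are not reproduced, and the small collaboration graph is hypothetical.

```python
from collections import deque

def concentric_levels(adjacency, source):
    """Group nodes by shortest-path distance (in edges) from `source`.

    `adjacency` maps each node to an iterable of its neighbours.
    Returns a dict mapping each level d to the set of nodes at distance d.
    """
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    levels = {}
    for node, d in distance.items():
        levels.setdefault(d, set()).add(node)
    return levels

# Hypothetical undirected collaboration graph.
GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}

for level, nodes in sorted(concentric_levels(GRAPH, "a").items()):
    print(level, sorted(nodes))
```

Per-level counts such as the number of nodes in each ring give every node a feature vector that reflects its wider surroundings and not only its immediate degree.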