830 results for Computing Classification Systems
Abstract:
Frequent pattern discovery in structured data is receiving increasing attention in many scientific application areas. However, the computational complexity and the large amount of data to be explored often make sequential algorithms unsuitable. In this context, high-performance distributed computing becomes a very interesting and promising approach. In this paper we present a parallel formulation of the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The application is characterized by a highly irregular tree-structured computation. No estimation is available for task workloads, which show a power-law distribution in a wide range. The proposed approach allows dynamic resource aggregation and provides fault and latency tolerance. These features make the distributed application suitable for multi-domain heterogeneous environments, such as computational Grids. The distributed application has been evaluated on the well-known National Cancer Institute’s HIV-screening dataset.
Abstract:
Identification of Fusarium species has always been difficult due to confusing phenotypic classification systems. We have developed a fluorescent-based polymerase chain reaction assay that allows for rapid and reliable identification of five toxigenic and pathogenic Fusarium species: Fusarium avenaceum, F. culmorum, F. equiseti, F. oxysporum and F. sambucinum. The method is based on the PCR amplification of species-specific DNA fragments using fluorescent oligonucleotide primers, which were designed based on sequence divergence within the internal transcribed spacer region of nuclear ribosomal DNA. Besides providing an accurate, reliable, and quick diagnosis of these Fusaria, the method reduces the potential for exposure to carcinogenic chemicals, because it uses fluorescent dyes in place of ethidium bromide. Apart from its multidisciplinary importance and usefulness, it also obviates the need for gel electrophoresis. (C) 2002 Published by Elsevier Science B.V. on behalf of the Federation of European Microbiological Societies.
Abstract:
This paper presents an approach for automatic classification of pulsed Terahertz (THz), or T-ray, signals, highlighting their potential in biomedical, pharmaceutical and security applications. T-ray classification systems supply a wealth of information about test samples and make possible the discrimination of heterogeneous layers within an object. In this paper, a novel technique involving the use of Auto Regressive (AR) and Auto Regressive Moving Average (ARMA) models on the wavelet transforms of measured T-ray pulse data is presented. Two example applications are examined: the classification of normal human bone (NHB) osteoblasts against human osteosarcoma (HOS) cells and the identification of six different powder samples. A variety of model types and orders are used to generate descriptive features for subsequent classification. Wavelet-based de-noising with soft threshold shrinkage is applied to the measured T-ray signals prior to modeling. For classification, a simple Mahalanobis distance classifier is used. After feature extraction, classification accuracy for cancerous and normal cell types is 93%, whereas for powders, it is 98%.
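As an illustration of the final step in the abstract above, here is a minimal sketch of a Mahalanobis distance classifier. It is not the authors' implementation: it is restricted to two features per sample so that the covariance inverse has a closed form, and the function names, class labels and feature values are toy assumptions rather than T-ray data.

```python
from statistics import mean


def fit_class(samples):
    """Return (mean vector, inverse 2x2 covariance) for a list of (x, y) features."""
    xs = [s[0] for s in samples]
    ys = [s[1] for s in samples]
    mx, my = mean(xs), mean(ys)
    n = len(samples) - 1  # unbiased covariance estimate
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in samples) / n
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular; add more varied samples")
    inverse = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inverse


def mahalanobis_sq(point, model):
    """Squared Mahalanobis distance from a point to a fitted class model."""
    (mx, my), inv = model
    dx, dy = point[0] - mx, point[1] - my
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))


def classify(point, models):
    """Assign the label whose class distribution is nearest to the point."""
    return min(models, key=lambda label: mahalanobis_sq(point, models[label]))


# Toy training features (hypothetical values, two features per sample).
training = {
    "A": [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2), (1.1, 2.1)],
    "B": [(3.0, 0.5), (3.2, 0.4), (2.9, 0.7), (3.1, 0.6)],
}
models = {label: fit_class(samples) for label, samples in training.items()}
print(classify((1.0, 2.1), models))
print(classify((3.0, 0.6), models))
```

In a setting like the one described, the feature pairs would be replaced by AR or ARMA coefficients, and a general matrix inverse would be needed for more than two features.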
Abstract:
An adequate classification of funds is important so that investors can organize the available information in a way that lets them allocate their resources efficiently. In Brazil, two classification systems are widely used, CVM and ANBIMA, but both have categories with subjective boundaries; that is, their categories are defined with a high degree of arbitrariness, which hampers efficient allocation by the investor. Multimarket funds (fundos multimercado) are funds whose investment policy involves several risk factors without concentration in any particular factor, unlike the other fund classes in the Brazilian market. In this respect, an adequate categorization of multimarket funds would bring numerous benefits, such as lower analysis costs, an easier investment decision process, more efficient diversification, clearer performance comparison and a better understanding of the risks incurred, among others. The present work aims to use the well-established style analysis technique of Sharpe (1992) to decompose each fund's exposure into its main risk factors and then to use cluster analysis to group the funds consistently with their exposures, thereby attempting a more efficient classification. This would be a counterpoint to the classification most used in the Brazilian market, the ANBIMA classification, which is based on the fund's regulations, that is, on what the fund "may" invest in, and not on what the fund actually invests in.
Abstract:
In this paper, artificial neural networks (ANNs) based on supervised and unsupervised algorithms were investigated for use in the study of rheological parameters of solid pharmaceutical excipients, in order to develop computational tools for manufacturing solid dosage forms. Among the four supervised neural networks investigated, the best learning performance was achieved by a feedforward multilayer perceptron whose architecture was composed of eight neurons in the input layer, sixteen neurons in the hidden layer and one neuron in the output layer. Learning and predictive performance for the repose angle was poor, while the Carr index and Hausner ratio (CI and HR, respectively) showed very good fitting capacity and learning; therefore HR and CI were considered suitable descriptors for the next stage of development of supervised ANNs. Clustering capacity was evaluated for five unsupervised strategies. Networks based on purely unsupervised competitive strategies, classic "Winner-Take-All", "Frequency-Sensitive Competitive Learning" and "Rival-Penalized Competitive Learning" (WTA, FSCL and RPCL, respectively), were able to cluster the database, but this classification was very poor, showing severe classification errors by grouping data with conflicting properties into the same cluster or even the same neuron. Moreover, it could not be established which criteria the neural networks adopted for that clustering. Self-Organizing Maps (SOM) and Neural Gas (NG) networks showed better clustering capacity. Both recognized the two major groupings of data, corresponding to lactose (LAC) and cellulose (CEL). However, SOM showed some errors in classifying data from the minority excipients, magnesium stearate (EMG), talc (TLC) and attapulgite (ATP). The NG network, in turn, classified the data very consistently and resolved the misclassifications of SOM, making it the most appropriate network for classifying the data of the study.
The use of NG networks in pharmaceutical technology had not previously been published. NG therefore has great potential for the development of software for automated classification systems of pharmaceutical powders and as a new tool for mining and clustering data in drug development.
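The Carr index and Hausner ratio named in the abstract above are commonly computed from the bulk and tapped densities of a powder. The short sketch below uses those standard definitions; the function names and the density values are illustrative assumptions, not data from the study.

```python
def hausner_ratio(bulk_density, tapped_density):
    """Hausner ratio: tapped density divided by bulk density."""
    return tapped_density / bulk_density


def carr_index(bulk_density, tapped_density):
    """Carr (compressibility) index in percent."""
    return 100.0 * (tapped_density - bulk_density) / tapped_density


# Hypothetical densities in g/mL, chosen only to show the arithmetic.
bulk, tapped = 0.5, 0.625
print(hausner_ratio(bulk, tapped))
print(carr_index(bulk, tapped))
```

Both descriptors carry the same information, since CI = 100 * (1 - 1/HR), which is consistent with the two behaving similarly as network targets.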
Abstract:
Nowadays, classifying proteins into structural classes, which concerns the inference of patterns in their 3D conformation, is one of the most important open problems in Molecular Biology. The main reason for this is that the function of a protein is intrinsically related to its spatial conformation. However, such conformations are very difficult to obtain experimentally in the laboratory. Thus, this problem has drawn the attention of many researchers in Bioinformatics. Considering the great difference between the number of protein sequences already known and the number of three-dimensional structures determined experimentally, the demand for automated techniques for structural classification of proteins is very high. In this context, computational tools, especially Machine Learning (ML) techniques, have become essential to deal with this problem. In this work, ML techniques are used in the recognition of protein structural classes: Decision Trees, k-Nearest Neighbor, Naive Bayes, Support Vector Machine and Neural Networks. These methods have been chosen because they represent different paradigms of learning and have been widely used in the Bioinformatics literature. Aiming to improve the performance of these techniques (individual classifiers), homogeneous (Bagging and Boosting) and heterogeneous (Voting, Stacking and StackingC) multi-classification systems are used. Moreover, since the protein database used in this work presents the problem of imbalanced classes, artificial techniques for class balance (Random Undersampling, Tomek Links, CNN, NCL and OSS) are used to minimize such a problem. In order to evaluate the ML methods, a cross-validation procedure is applied, where the accuracy of the classifiers is measured using the mean classification error rate on independent test sets. These means are compared, two by two, by a hypothesis test to evaluate whether there is a statistically significant difference between them.
With respect to the results obtained with the individual classifiers, Support Vector Machine presented the best accuracy. The multi-classification systems (homogeneous and heterogeneous) showed, in general, a superior or similar performance when compared to the one achieved by the individual classifiers used - especially Boosting with Decision Tree and StackingC with Linear Regression as meta classifier. The Voting method, despite its simplicity, proved adequate for solving the problem presented in this work. The techniques for class balance, on the other hand, did not produce a significant improvement in the global classification error. Nevertheless, the use of such techniques did improve the classification error for the minority class. In this context, the NCL technique proved more appropriate than the others.
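To make the Voting scheme mentioned above concrete, here is a minimal sketch of plain majority voting over the outputs of several already-trained classifiers. The function names and the class labels are hypothetical, and the abstract does not say which voting rule or tie-breaking policy the authors used.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


def vote_ensemble(per_classifier_predictions):
    """Combine one list of predictions per classifier into one label per sample."""
    return [majority_vote(votes) for votes in zip(*per_classifier_predictions)]


# Three hypothetical classifiers predicting labels for three samples.
predictions = [
    ["a", "b", "a"],  # classifier 1
    ["a", "b", "b"],  # classifier 2
    ["b", "b", "a"],  # classifier 3
]
print(vote_ensemble(predictions))
```

Stacking differs from this in that a meta classifier is trained on the base classifiers' outputs instead of counting votes.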
Abstract:
The specificity and currency of bibliographic classification systems can be considered determining factors in the quality of the organization and representation of legal documentation. In the specific case of Brazil, the Brazilian Law Decimal Classification does not foresee specific subdivisions for Labor Law procedures. To address this, the study carries out a terminological analysis based on the tables of contents of doctrinal Labor Law books, which are compared to the conceptual structure of the Brazilian Law Decimal Classification. As a result, it presents an extension proposal for Labor Procedures as well as a methodological background for further extensions and updates.
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Abstract:
Mapping land use, vegetation and environmental impacts using remote sensing and geoprocessing allows the detection, spatial representation and quantification of the alterations caused by human action on nature, contributing to the monitoring and planning of activities that may damage the environment. This study applies methodologies based on digital processing of orbital images to map land use, vegetation and anthropic activities that cause environmental impacts. A test area was considered in the district of Assistência and its surroundings, in the Rio Claro (SP) region. The proposed methodology was checked by crossing maps in the GIS software Idrisi. These maps were obtained either by conventional interpretation of aerial photos from 1995, digitized in the software CAD Overlay and geo-referenced in AutoCAD Map, or by applying digital classification systems to SPOT-XS and PAN orbital images from 1995, followed by field observations. Crossing the conventional and digital maps of the same area in the GIS makes it possible to verify the overall results obtained through the computational handling of orbital images. With the use of digital processing techniques, especially multispectral classification, it is possible to detect automatically and visually the impacts related to mineral extraction, as well as to survey land use, vegetation and environmental impacts.
Abstract:
Despite being known to science for a long time, the phenomenon of seed dormancy still baffles the scientific community because of its multiple, complex underlying mechanisms. The current classification systems of seed dormancy attempt to condense all that is known about the phenomenon into a conceptual database that would make upcoming information easier to interpret and allow for better contextualization of research in this field. The present paper is a preliminary overview of the current panorama of concepts and classification systems of seed dormancy, intended to serve as a starting point for future research in this field.
Abstract:
Pattern recognition in large amounts of data has been paramount in the last decade, since it is not straightforward to design interactive and real-time classification systems. Very recently, the Optimum-Path Forest (OPF) classifier was proposed to overcome such limitations, together with its training set pruning algorithm, which requires a parameter that has so far been set empirically. In this paper, we propose a Harmony Search-based algorithm that can find near-optimal values for that parameter. The experimental results have shown that our algorithm is able to find proper values for the OPF pruning algorithm parameter. © 2011 IEEE.
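The abstract above does not give the details of its algorithm, so the following is only a generic sketch of Harmony Search tuning a single bounded parameter. The objective function is a hypothetical stand-in (a quadratic with its minimum at 0.3), not the OPF pruning criterion, and all names and default settings are assumptions.

```python
import random


def harmony_search(objective, low, high, memory_size=5, iterations=500,
                   hmcr=0.9, par=0.3, bandwidth=0.05, seed=0):
    """Minimize a one-dimensional objective on [low, high] with Harmony Search."""
    rng = random.Random(seed)
    # Harmony memory: a small pool of candidate parameter values.
    memory = [rng.uniform(low, high) for _ in range(memory_size)]
    for _ in range(iterations):
        if rng.random() < hmcr:
            # Memory consideration: reuse a stored value...
            candidate = rng.choice(memory)
            if rng.random() < par:
                # ...and sometimes pitch-adjust it by a small random step.
                candidate += bandwidth * rng.uniform(-1.0, 1.0)
        else:
            # Random consideration: draw a fresh value from the whole range.
            candidate = rng.uniform(low, high)
        candidate = min(max(candidate, low), high)  # keep within bounds
        worst = max(range(memory_size), key=lambda i: objective(memory[i]))
        if objective(candidate) < objective(memory[worst]):
            memory[worst] = candidate  # replace the worst harmony
    return min(memory, key=objective)


def toy_objective(x):
    """Hypothetical cost curve standing in for a validation-based criterion."""
    return (x - 0.3) ** 2


best = harmony_search(toy_objective, 0.0, 1.0)
print(best)
```

In a real tuning loop the objective would be expensive (for example, an error measured after pruning and retraining), so the number of iterations would be chosen with that cost in mind.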
Abstract:
Objective: To evaluate the correlations between clinical-radiographical aspects and histomorphometric-molecular parameters of endosseous dental implant sites in humans. Material and methods: The study sample consisted of bone implant sites from the jawbones of 32 volunteers, which were classified according to two different systems: (1) based only on periapical and panoramic images (PP); (2) as proposed by Lekholm & Zarb (L&Z). Bone biopsies were removed using a trephine during the first drilling for implant placement. Samples were stained with haematoxylin-eosin (HE), and histomorphometric analysis was performed to obtain the following parameters: trabecular thickness (Tb.Th), trabecular number, bone volume density (BV/TV), bone specific surface (BS/BV), bone surface density and trabecular separation (Tb.Sp). In addition, immunohistochemistry analysis was performed on bone tissue samples for the proteins receptor activator of nuclear factor kappa-B (RANK), RANK ligand (RANKL), osteoprotegerin (OPG) and osteocalcin (OC). Also, the relative levels of gene expression were determined using Reverse transcription-real-time Polymerase Chain Reaction (RT-PCR). Results: PP and L&Z classification systems revealed a moderate correlation with BV/TV, BS/BV, Tb.Th and Tb.Sp. L&Z's system identified differences among bone types when BV/TV, BS/BV, Tb.Th and Tb.Sp were compared. A weak correlation between PP/L&Z classifications and the expression of bone metabolism regulators (RANK, RANKL, OPG and OC) was found. The analysis of mRNA expression showed no difference between the bone types evaluated. Conclusions: Our results suggest that PP and L&Z subjective bone-type classification systems are related to histomorphometric aspects. These data may contribute to the validation of these classifications. Bone remodelling regulatory molecules do not seem to influence morphological aspects of the jawbone. © 2011 John Wiley & Sons A/S.