957 results for self-organizing maps
Abstract:
Background: Sugarcane is an increasingly economically and environmentally important C4 grass, used for the production of sugar and bioethanol, a low-carbon emission fuel. Sugarcane originated from crosses of Saccharum species and is noted for its unique capacity to accumulate high amounts of sucrose in its stems. Environmental stresses severely limit sugarcane productivity worldwide. To investigate transcriptome changes in response to environmental inputs that alter yield, we used cDNA microarrays to profile the expression of 1,545 genes in plants subjected to drought, phosphate starvation, herbivory and N2-fixing endophytic bacteria. We also investigated the response to phytohormones (abscisic acid and methyl jasmonate). The arrayed elements correspond mostly to genes involved in signal transduction, hormone biosynthesis, transcription factors, novel genes and genes corresponding to unknown proteins.
Results: Using an outlier-search method, 179 genes with strikingly different expression levels were identified as differentially expressed in at least one of the treatments analysed. Self-organizing maps were used to cluster the expression profiles of 695 genes that showed a highly correlated expression pattern among replicates. The expression data for 22 genes were evaluated for 36 experimental data points by quantitative RT-PCR, indicating a validation rate of 80.5% using three biological experimental replicates. The SUCAST Database was created to provide public access to the data described in this work, linked to tissue expression profiling and the SUCAST gene category and sequence analysis. The SUCAST database also includes a categorization of the sugarcane kinome based on a phylogenetic grouping that included 182 undefined kinases.
Conclusion: An extensive study on the sugarcane transcriptome was performed. Sugarcane genes responsive to phytohormones and to challenges sugarcane commonly deals with in the field were identified. Additionally, the protein kinases were annotated based on a phylogenetic approach. The experimental design and statistical analysis applied proved robust in uncovering genes associated with a diverse array of conditions, attributing novel functions to previously unknown or undefined genes. The data consolidated in the SUCAST database resource can guide further studies and be useful for the development of improved sugarcane varieties.
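As a rough illustration of the clustering step mentioned in the abstract above, the sketch below trains a very small self-organizing map on toy expression profiles. It is not the authors' pipeline: the grid size, the linear decay schedules, the Gaussian neighborhood, the toy data, and the names `train_som` and `best_matching_unit` are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i]))


def train_som(data, rows=2, cols=2, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM with online updates (illustrative only)."""
    rng = random.Random(seed)
    # Grid coordinates of each unit, used by the neighborhood function.
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialize each unit from a randomly chosen sample (assumed scheme).
    weights = [list(rng.choice(data)) for _ in coords]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)               # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.01  # shrinking neighborhood width
            br, bc = coords[best_matching_unit(weights, x)]
            for i, (r, c) in enumerate(coords):
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                # Move the unit toward the sample, scaled by rate and neighborhood.
                weights[i] = [w + lr * h * (xi - w) for w, xi in zip(weights[i], x)]
            step += 1
    return weights


# Hypothetical expression profiles (rows: genes, columns: treatments).
profiles = [[1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [-1.0, -2.0, -3.0], [-0.9, -2.2, -3.1]]
som = train_som(profiles)
clusters = [best_matching_unit(som, p) for p in profiles]
```

Each entry of `clusters` is the index of the map unit a profile falls on; profiles that share a unit (or neighboring units) are treated as having similar expression patterns.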
Abstract:
Spam, or unsolicited bulk email, is a threat that affects email and other electronic communication media. Its high volume of circulation causes considerable losses of time and money. A solution to this problem is presented: a hybrid intelligent anti-spam filtering system based on unsupervised artificial neural networks (ANNs). It consists of a preprocessing stage and a processing stage, based on different computing models: programmed (with two phases, manual and computational) and neural (using Kohonen self-organizing maps, SOM), respectively. The system has been optimized using, as its data corpus, ham from "Enron Email" and spam from two different sources. Its quality and performance are analyzed using different metrics.
Abstract:
Gamma-radiation exposure has both short- and long-term adverse health effects. The threat of modern terrorism places human populations at risk for radiological exposures, yet current medical countermeasures to radiation exposure are limited. Here we describe metabolomics for gamma-radiation biodosimetry in a mouse model. Mice were gamma-irradiated at doses of 0, 3 and 8 Gy (2.57 Gy/min), and urine samples collected over the first 24 h after exposure were analyzed by ultra-performance liquid chromatography-time-of-flight mass spectrometry (UPLC-TOFMS). Multivariate data were analyzed by orthogonal partial least squares (OPLS). Both 3- and 8-Gy exposures yielded distinct urine metabolomic phenotypes. The top 22 ions for 3 and 8 Gy were analyzed further, including tandem mass spectrometric comparison with authentic standards, revealing that N-hexanoylglycine and beta-thymidine are urinary biomarkers of exposure to 3 and 8 Gy, 3-hydroxy-2-methylbenzoic acid 3-O-sulfate is elevated in urine of mice exposed to 3 but not 8 Gy, and taurine is elevated after 8 but not 3 Gy. Gene Expression Dynamics Inspector (GEDI) self-organizing maps showed clear dose-response relationships for subsets of the urine metabolome. This approach is useful for identifying mice exposed to gamma radiation and for developing metabolomic strategies for noninvasive radiation biodosimetry in humans.
Abstract:
Salamanca is classified as one of the most polluted cities in Mexico. To observe the behavior of Sulphur Dioxide (SO2) concentrations and clarify the influence of wind parameters on them, a Self-Organizing Map (SOM) neural network was implemented at three monitoring locations for the period from January 1 to December 31, 2006. The maximum and minimum daily values of SO2 concentrations measured during 2006 were correlated with the wind parameters of the same period. The main advantages of the SOM neural network are that it can integrate data from different sensors and that it provides readily interpretable results. In particular, it is a powerful mapping and classification tool that offers information in an accessible way and makes it easier to prioritize the distinguished groups of concentrations according to their need for further research or remediation actions in subsequent management steps. For each monitoring location, SOM classifications were evaluated with respect to pollution levels established by Health Authorities. The classification system can help to establish a better air quality monitoring methodology, which is essential for assessing the effectiveness of imposed pollution controls and strategies and for facilitating pollutant reduction.
Abstract:
The area of Human-Machine Interface is growing fast due to its high importance in all technological systems. The basic idea behind designing human-machine interfaces is to enrich communication with technology in a natural and easy way. Gesture interfaces are a good example of transparent interfaces. Such interfaces must properly identify the action the user wants to perform, so proper gesture recognition is of the highest importance. However, most systems based on gesture recognition use complex methods requiring high-resource devices. In this work, we propose to model gestures by capturing their temporal properties, which significantly reduces storage requirements, and to use clustering techniques, namely self-organizing maps and an unsupervised genetic algorithm, for their classification. We further propose to train a number of algorithms with different parameters and combine their decisions using majority voting in order to decrease the false positive rate. The main advantage of the approach is its simplicity, which enables implementation on devices with limited resources, and therefore at low cost. The testing results demonstrate its high potential.
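The majority-voting step described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the classifiers are modeled as plain callables returning a label, and the tie-breaking rule (first label seen wins) and the names `majority_vote` and `ensemble_predict` are hypothetical, not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most classifiers.

    Ties are broken in favor of the label that appears first in `predictions`
    (an assumed rule; the abstract does not specify one).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def ensemble_predict(classifiers, sample):
    """Ask every classifier for a label and combine the answers by majority vote."""
    return majority_vote([clf(sample) for clf in classifiers])


# Hypothetical classifiers trained with different parameters (here: thresholds).
classifiers = [
    lambda s: "swipe" if s > 0.3 else "tap",
    lambda s: "swipe" if s > 0.5 else "tap",
    lambda s: "swipe" if s > 0.7 else "tap",
]
label = ensemble_predict(classifiers, 0.6)
```

A single misconfigured classifier is outvoted by the others, which is the intuition behind using the ensemble to lower the false positive rate.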
Abstract:
Over the last ten years, Salamanca has been considered among the most polluted cities in México. This paper presents a Self-Organizing Map (SOM) neural network application to classify pollution data and automate the determination of air pollution levels for Sulphur Dioxide (SO2) in Salamanca. Meteorological parameters are well known to be important factors contributing to air quality estimation and prediction. To observe the behavior of SO2 concentrations and clarify the influence of wind parameters on them, a SOM neural network was implemented over the course of a year. The main advantages of the SOM are that it can integrate data from different sensors and that it provides readily interpretable results. In particular, it is a powerful mapping and classification tool that offers information in an accessible way and makes it easier to prioritize the distinguished groups of concentrations according to their need for further research or remediation actions in subsequent management steps. The results show a significant correlation between pollutant concentrations and some environmental variables.
Abstract:
Providing security to the emerging field of ambient intelligence will be difficult if we rely only on existing techniques, given the dynamic and heterogeneous nature of such systems. Moreover, security demands of these systems are expected to grow, as many applications will require accurate context modeling. In this work we propose an enhancement to the reputation systems traditionally deployed for securing these systems. Different anomaly detectors are combined using the immunological paradigm to optimize reputation system performance in response to evolving security requirements. As an example, the experiments show how a combination of detectors based on unsupervised techniques (self-organizing maps and genetic algorithms) can help to significantly reduce the global response time of the reputation system. The proposed solution offers many benefits: scalability, fast response to adversarial activities, ability to detect unknown attacks, high adaptability, and high ability in detecting and confining attacks. For these reasons, we believe that our solution is capable of coping with the dynamism of ambient intelligence systems and the growing requirements of security demands.
Abstract:
We introduce a method of functionally classifying genes by using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines (SVMs). SVMs are considered a supervised computer learning method because they exploit prior knowledge of gene function to identify unknown genes of similar function from expression data. SVMs avoid several problems associated with unsupervised clustering methods, such as hierarchical clustering and self-organizing maps. SVMs have many mathematical features that make them attractive for gene expression analysis, including their flexibility in choosing a similarity function, sparseness of solution when dealing with large data sets, the ability to handle large feature spaces, and the ability to identify outliers. We test several SVMs that use different similarity metrics, as well as some other supervised learning methods, and find that the SVMs best identify sets of genes with a common function using expression data. Finally, we use SVMs to predict functional roles for uncharacterized yeast ORFs based on their expression data.
Abstract:
Two-phase gas-liquid flow is found in many closed circuits that use natural circulation for cooling purposes. The natural circulation phenomenon is important for heat removal in recent nuclear power plant designs. The Natural Circulation Circuit (Circuito de Circulação Natural, CCN), installed at the Instituto de Pesquisas Energéticas e Nucleares, IPEN/CNEN, is an experimental circuit designed to provide thermal-hydraulic data related to single-phase or two-phase flow under natural circulation conditions. Heat transfer estimation has been improved on the basis of models that require accurate prediction of flow pattern transitions. This work presents experimental tests carried out in the CCN to visualize instability phenomena in basic natural circulation cycles and to classify the two-phase flow patterns associated with flow transients and static instabilities. The images are compared and grouped using Kohonen self-organizing maps (SOM), applied to different features of the digital image. Full-Frame Discrete Cosine Transform (FFDCT) coefficients were used as input for the classification task, leading to good results. The FFDCT prototypes obtained can be associated with each flow pattern, enabling a better understanding of the observed instability. A systematic methodology was used to verify the robustness of the method.
Abstract:
In this paper we propose a neural network model to simplify 2D meshes. This model is based on the Growing Neural Gas model and is able to simplify any mesh with different topologies and sizes. A triangulation process is included in order to reconstruct the mesh. This model is applied to some problems related to urban networks.
Abstract:
In this work, a modified version of the elastic bunch graph matching (EBGM) algorithm for face recognition is introduced. First, faces are detected by using a fuzzy skin detector based on the RGB color space. Then, the fiducial points for the facial graph are extracted automatically by adjusting a grid of points to the result of an edge detector. After that, the position of the nodes, their relation with their neighbors and their Gabor jets are calculated in order to obtain the feature vector defining each face. A self-organizing map (SOM) framework is then presented, in which the calculation of the winning neuron and the recognition process are performed by using a similarity function that takes into account both the geometric and texture information of the facial graph. The set of experiments carried out for our SOM-EBGM method shows the accuracy of our proposal when compared with other state-of-the-art methods.
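To make the idea of a winning neuron chosen by a combined geometric and texture similarity more concrete, here is a minimal sketch. The abstract does not give the actual similarity function, so everything below is an assumption: the weighted-sum form, the weight `alpha`, the use of cosine similarity for the texture vectors, the exponential decay for node distances, and the dictionary layout with hypothetical keys `"nodes"` and `"jets"`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def graph_similarity(face, neuron, alpha=0.5):
    """Weighted mix of texture and geometric similarity (assumed form)."""
    n = len(face["nodes"])
    # Texture term: mean cosine similarity between corresponding feature vectors.
    texture = sum(
        cosine_similarity(f, g) for f, g in zip(face["jets"], neuron["jets"])
    ) / n
    # Geometric term: decays with the mean distance between corresponding nodes.
    mean_dist = sum(
        math.dist(p, q) for p, q in zip(face["nodes"], neuron["nodes"])
    ) / n
    geometry = math.exp(-mean_dist)
    return alpha * texture + (1.0 - alpha) * geometry


def winning_neuron(face, neurons, alpha=0.5):
    """Index of the neuron with the highest similarity to the input graph."""
    return max(
        range(len(neurons)),
        key=lambda i: graph_similarity(face, neurons[i], alpha),
    )


# Toy facial graphs with two nodes and two-dimensional feature vectors.
face = {"nodes": [(0.0, 0.0), (1.0, 0.0)], "jets": [[1.0, 0.0], [0.0, 1.0]]}
neurons = [
    {"nodes": [(0.0, 1.0), (1.0, 1.0)], "jets": [[0.0, 1.0], [1.0, 0.0]]},
    {"nodes": [(0.0, 0.0), (1.0, 0.0)], "jets": [[1.0, 0.0], [0.0, 1.0]]},
]
winner = winning_neuron(face, neurons)
```

Unlike the usual Euclidean best-matching-unit search, the winner here maximizes a similarity score, so shifting `alpha` toward 1 makes texture dominate and toward 0 makes geometry dominate.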
Abstract:
Multidimensional compound optimization is a new paradigm in the drug discovery process, yielding efficiencies during early stages and reducing attrition in the later stages of drug development. The success of this strategy relies heavily on understanding this multidimensional data and extracting useful information from it. This paper demonstrates how principled visualization algorithms can be used to understand and explore a large data set created in the early stages of drug discovery. The experiments presented are performed on a real-world data set comprising biological activity data and some whole-molecular physicochemical properties. Data visualization is a popular way of presenting complex data in a simpler form. We have applied powerful principled visualization methods, such as generative topographic mapping (GTM) and hierarchical GTM (HGTM), to help the domain experts (screening scientists, chemists, biologists, etc.) understand the data and make meaningful decisions. We also benchmark these principled methods against better-known visualization approaches, namely principal component analysis (PCA), Sammon's mapping, and self-organizing maps (SOMs), to demonstrate their enhanced power to help the user visualize the large multidimensional data sets one has to deal with during the early stages of the drug discovery process. The results reported clearly show that the GTM and HGTM algorithms allow the user to cluster active compounds for different targets and understand them better than the benchmarks. An interactive software tool supporting these visualization algorithms was provided to the domain experts. The tool helps the domain experts explore the projection obtained from the visualization algorithms by providing facilities such as parallel coordinate plots, magnification factors, directional curvatures, and integration with industry standard software. © 2006 American Chemical Society.
Abstract:
Magnification factors specify the extent to which the area of a small patch of the latent (or 'feature') space of a topographic mapping is magnified on projection to the data space, and are of considerable interest in both neuro-biological and data analysis contexts. Previous attempts to consider magnification factors for the self-organizing map (SOM) algorithm have been hindered because the mapping is only defined at discrete points (given by the reference vectors). In this paper we consider the batch version of SOM, for which a continuous mapping can be defined, as well as the Generative Topographic Mapping (GTM) algorithm of Bishop et al. (1997) which has been introduced as a probabilistic formulation of the SOM. We show how the techniques of differential geometry can be used to determine magnification factors as continuous functions of the latent space coordinates. The results are illustrated here using a problem involving the identification of crab species from morphological data.
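For reference, the quantity described in this abstract can be written in the standard differential-geometric form for a smooth mapping from the latent space to the data space. The notation below (latent coordinates $x_i$, data-space image $y_j$, Jacobian $J$, induced metric $g$, area elements $dA$ and $dA'$) is chosen here for illustration and may differ from the paper's own.

```latex
% Smooth mapping from latent coordinates x = (x_1, ..., x_L)
% to a point y(x) = (y_1, ..., y_D) in data space.
\[
J_{ij} = \frac{\partial y_j}{\partial x_i},
\qquad
g = J J^{\mathsf{T}},
\qquad
\frac{dA'}{dA} = \sqrt{\det g}
\]
% dA  : area of a small patch in latent space
% dA' : area of its image in data space
```

Because $J$ depends on the latent coordinates, the ratio $dA'/dA$ is itself a continuous function over the latent space, which is why a continuously defined mapping is needed before it can be evaluated.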
Abstract:
Data visualization algorithms and feature selection techniques are both widely used in bioinformatics but as distinct analytical approaches. Until now there has been no method of measuring feature saliency while training a data visualization model. We derive a generative topographic mapping (GTM) based data visualization approach which estimates feature saliency simultaneously with the training of the visualization model. The approach not only provides a better projection by modeling irrelevant features with a separate noise model but also gives feature saliency values which help the user to assess the significance of each feature. We compare the quality of projection obtained using the new approach with the projections from traditional GTM and self-organizing maps (SOM) algorithms. The results obtained on a synthetic and a real-life chemoinformatics dataset demonstrate that the proposed approach successfully identifies feature significance and provides coherent (compact) projections. © 2006 IEEE.