6 results for Information visualization at Instituto Politécnico do Porto, Portugal


Relevance:

40.00%

Publisher:

Abstract:

We describe a novel approach to explore DNA nucleotide sequence data, aiming to produce high-level categorical and structural information about the underlying chromosomes, genomes and species. The article starts by analyzing chromosomal data through histograms of fixed-length DNA sequences. After creating the DNA-related histograms, the correlation between each pair of histograms is computed, producing a global correlation matrix. These data are then used as input to several data processing methods for information extraction and tabular/graphical output generation. A set of 18 species is processed, and the extensive results reveal that the proposed method is able to generate significant and diversified outputs, in good accordance with current scientific knowledge in domains such as genomics and phylogenetics.
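
A minimal sketch of the histogram and correlation-matrix steps described above. The abstract does not state the word length, the windowing, or the correlation measure, so overlapping words of length k, relative frequencies and Pearson correlation are assumptions here. The names kmer_histogram, pearson and correlation_matrix, and the toy sequences, are hypothetical.

```python
from itertools import product
from math import sqrt


def kmer_histogram(seq, k=3):
    """Relative frequencies of all 4**k DNA words of length k (overlapping windows)."""
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(words, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows with ambiguous symbols such as N
            counts[word] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    return [counts[w] / total for w in words]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (assumed measure)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant histogram: correlation undefined, report 0
    return cov / sqrt(vx * vy)


def correlation_matrix(histograms):
    """Square matrix of pairwise correlations between histograms."""
    return [[pearson(a, b) for b in histograms] for a in histograms]


# Toy sequences standing in for chromosomes (not real data).
toy = {"chrA": "ACGTACGTTAGC", "chrB": "ACGTACGATAGC", "chrC": "GGGGCCCCGGGG"}
hists = [kmer_histogram(s, k=2) for s in toy.values()]
corr = correlation_matrix(hists)
```

In this toy case chrA and chrB differ by a single base, so their histograms correlate more strongly with each other than either does with chrC. A matrix of this kind is what the downstream processing methods would take as input.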

Relevance:

40.00%

Publisher:

Abstract:

This paper analyzes the DNA code of several species from the perspective of information content. For that purpose, several concepts and mathematical tools are selected to establish a quantitative method that does not distort a priori the alphabet represented by the sequence of DNA bases. The synergies of associating Gray code, histogram characterization and multidimensional scaling visualization lead to a collection of plots with a categorical representation of species and chromosomes.
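
For readers unfamiliar with Gray code, the sketch below generates the binary-reflected Gray code, in which consecutive codewords differ in exactly one bit. The abstract does not say how the code is attached to the DNA bases, so the base-to-codeword assignment shown is purely hypothetical, as are the function names.

```python
def gray(n):
    """Binary-reflected Gray code of the non-negative integer n."""
    return n ^ (n >> 1)


def gray_codes(bits):
    """All Gray codewords of the given width, as bit strings, in sequence order."""
    return [format(gray(i), f"0{bits}b") for i in range(2 ** bits)]


def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


codes = gray_codes(2)

# Hypothetical assignment of 2-bit codewords to bases (not taken from the paper).
base_code = dict(zip("ACGT", codes))
```

With two bits the sequence is cyclic: each codeword differs from its successor, and the last from the first, in a single bit, so no pair of neighbouring symbols is favoured over another.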

Relevance:

40.00%

Publisher:

Abstract:

Seismic data is difficult to analyze, and classical mathematical tools reveal strong limitations in exposing hidden relationships between earthquakes. In this paper, we study earthquake phenomena from the perspective of complex systems. Global seismic data, covering the period from 1962 up to 2011, is analyzed. The events, characterized by their magnitude, geographic location and time of occurrence, are divided into groups, either according to the Flinn-Engdahl (F-E) seismic regions of Earth or using a rectangular grid based on latitude and longitude coordinates. Two methods of analysis are considered and compared in this study. In the first method, the distributions of magnitudes are approximated by Gutenberg-Richter (G-R) distributions and the parameters are used to reveal the relationships among regions. In the second method, the mutual information is calculated and adopted as a measure of similarity between regions. In both cases, using clustering analysis, visualization maps are generated, providing an intuitive and useful representation of the complex relationships that are present among seismic data. Such relationships might not be perceived on classical geographic maps. Therefore, the generated charts are a valid alternative to other visualization tools for understanding the global behavior of earthquakes.
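
As an illustration of the first method, the sketch below estimates the parameters of the Gutenberg-Richter law, log10 N(>=M) = a - b*M, for one region. The abstract does not say how the distributions were fitted; the maximum-likelihood estimator b = log10(e) / (mean(M) - M_min) for continuous magnitudes is an assumption, and the function name, the threshold m_min and the toy catalogue are hypothetical.

```python
from math import e, log10


def gutenberg_richter_fit(magnitudes, m_min):
    """Estimate (a, b) of log10 N(>=M) = a - b*M from magnitudes >= m_min.

    Uses the maximum-likelihood b-value for continuous magnitudes
    (an assumed choice; the source does not name a fitting method).
    """
    sample = [m for m in magnitudes if m >= m_min]
    if len(sample) < 2:
        raise ValueError("need at least two events at or above m_min")
    mean_excess = sum(sample) / len(sample) - m_min
    if mean_excess <= 0:
        raise ValueError("all magnitudes equal m_min; b is undefined")
    b = log10(e) / mean_excess
    # Choose a so that the law reproduces the observed count at m_min.
    a = log10(len(sample)) + b * m_min
    return a, b


# Toy catalogue for a single region (not real data).
region = [4.0, 4.5, 5.0, 5.5, 6.0, 3.2]
a, b = gutenberg_richter_fit(region, m_min=4.0)
```

Repeating the fit for each F-E region or grid cell gives one (a, b) pair per region, which could then serve as the feature vector compared across regions in the clustering step.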

Relevance:

30.00%

Publisher:

Abstract:

Proteins are biochemical entities consisting of one or more blocks typically folded in a 3D pattern. Each block (a polypeptide) is a single linear sequence of amino acids that are biochemically bonded together. The amino acid sequence in a protein is defined by the sequence of a gene or several genes encoded in the DNA-based genetic code. This genetic code typically uses twenty amino acids, but in certain organisms the genetic code can also include two other amino acids. After the amino acids are linked during protein synthesis, each amino acid becomes a residue in a protein, which is then chemically modified, ultimately changing and defining the protein function. In this study, the authors analyze the amino acid sequence using alignment-free methods, aiming to identify structural patterns in sets of proteins and in the proteome, without any prior assumptions. The paper starts by analyzing amino acid sequence data by means of histograms using fixed-length amino acid words (tuples). After the initial relative frequency histograms are created, they are transformed and processed in order to generate quantitative results for information extraction and graphical visualization. Selected samples from two reference datasets are used, and the results reveal that the proposed method is able to generate relevant outputs in accordance with current scientific knowledge in domains such as protein sequence and proteome analysis.

Relevance:

30.00%

Publisher:

Abstract:

This paper analyses forest fires from the perspective of dynamical systems. Forest fires exhibit complex correlations in size, space and time, revealing features often present in complex systems, such as the absence of a characteristic length-scale, or the emergence of long-range correlations and persistent memory. This study addresses a public domain forest fires catalogue, containing information on events in Portugal during the period from 1980 up to 2012. The data is analysed on an annual basis, modelling the occurrences as sequences of Dirac impulses with amplitude proportional to the burnt area. First, we consider mutual information to correlate annual patterns. We use visualization trees, generated by hierarchical clustering algorithms, in order to compare and to extract relationships among the data. Second, we adopt the Multidimensional Scaling (MDS) visualization tool. MDS generates maps where each object corresponds to a point. Objects that are perceived to be similar to each other are placed close together on the map, forming clusters. The results are analysed in order to extract relationships among the data and to identify forest fire patterns.
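
A minimal sketch of how two annual patterns could be compared by mutual information. Each year is represented as a daily series of impulses whose amplitude is the burnt area, the amplitudes are binned, and the mutual information is computed from the joint histogram of the two binned series. The daily resolution, the bin edges and all names are assumptions for illustration; the abstract does not specify them.

```python
from collections import Counter
from math import log2


def impulse_series(events, n_days=365):
    """Daily series for one year: total burnt area on each day, zero elsewhere.

    events is an iterable of (day_index, burnt_area) with 0-based day_index.
    """
    series = [0.0] * n_days
    for day, area in events:
        series[day] += area
    return series


def discretize(series, edges):
    """Replace each value by the number of bin edges it exceeds (assumed binning)."""
    return [sum(v > edge for edge in edges) for v in series]


def mutual_information(x, y):
    """Mutual information, in bits, of two equal-length symbol sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


# Toy years of 8 days each (not real data): (day, burnt area in hectares).
year_a = discretize(impulse_series([(1, 5.0), (4, 120.0)], n_days=8), edges=[0.0, 50.0])
year_b = discretize(impulse_series([(1, 8.0), (4, 300.0)], n_days=8), edges=[0.0, 50.0])
mi_ab = mutual_information(year_a, year_b)
```

Computing this quantity for every pair of years yields a symmetric similarity matrix, which is the kind of input a hierarchical clustering or MDS routine would take to produce the trees and maps mentioned above.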

Relevance:

30.00%

Publisher:

Abstract:

This paper analyses forest fires from the perspective of dynamical systems. Forest fires exhibit complex correlations in size, space and time, revealing features often present in complex systems, such as the absence of a characteristic length-scale, or the emergence of long-range correlations and persistent memory. This study addresses a public domain forest fires catalogue, containing information on events in Portugal during the period from 1980 up to 2012. The data is analysed on an annual basis, modelling the occurrences as sequences of Dirac impulses with amplitude proportional to the burnt area. First, we consider mutual information to correlate annual patterns. We use visualization trees, generated by hierarchical clustering algorithms, in order to compare and to extract relationships among the data. Second, we adopt the Multidimensional Scaling (MDS) visualization tool. MDS generates maps where each object corresponds to a point. Objects that are perceived to be similar to each other are placed close together on the map, forming clusters. The results are analysed in order to extract relationships among the data and to identify forest fire patterns.