2 resultados para Word’s power
em Instituto Politécnico do Porto, Portugal
Resumo:
Proteins are biochemical entities consisting of one or more blocks typically folded in a 3D pattern. Each block (a polypeptide) is a single linear sequence of amino acids that are biochemically bonded together. The amino acid sequence in a protein is defined by the sequence of a gene or several genes encoded in the DNA-based genetic code. This genetic code typically uses twenty amino acids, but in certain organisms the genetic code can also include two other amino acids. After linking the amino acids during protein synthesis, each amino acid becomes a residue in a protein, which is then chemically modified, ultimately changing and defining the protein function. In this study, the authors analyze the amino acid sequence using alignment-free methods, aiming to identify structural patterns in sets of proteins and in the proteome, without any other previous assumptions. The paper starts by analyzing amino acid sequence data by means of histograms using fixed length amino acid words (tuples). After creating the initial relative frequency histograms, they are transformed and processed in order to generate quantitative results for information extraction and graphical visualization. Selected samples from two reference datasets are used, and results reveal that the proposed method is able to generate relevant outputs in accordance with current scientific knowledge in domains like protein sequence/proteome analysis.
Resumo:
Power laws, also known as Pareto-like laws or Zipf-like laws, are commonly used to explain a variety of real world distinct phenomena, often described merely by the produced signals. In this paper, we study twelve cases, namely worldwide technological accidents, the annual revenue of America׳s largest private companies, the number of inhabitants in America׳s largest cities, the magnitude of earthquakes with minimum moment magnitude equal to 4, the total burned area in forest fires occurred in Portugal, the net worth of the richer people in America, the frequency of occurrence of words in the novel Ulysses, by James Joyce, the total number of deaths in worldwide terrorist attacks, the number of linking root domains of the top internet domains, the number of linking root domains of the top internet pages, the total number of human victims of tornadoes occurred in the U.S., and the number of inhabitants in the 60 most populated countries. The results demonstrate the emergence of statistical characteristics, very close to a power law behavior. Furthermore, the parametric characterization reveals complex relationships present at higher level of description.