895 resultados para sequence data mining
Resumo:
This document presents a tool able to automatically gather data provided by real energy markets and to generate scenarios, capture and improve market players’ profiles and strategies by using knowledge discovery processes in databases supported by artificial intelligence techniques, data mining algorithms and machine learning methods. It provides the means for generating scenarios with different dimensions and characteristics, ensuring the representation of real and adapted markets, and their participating entities. The scenarios generator module enhances the MASCEM (Multi-Agent Simulator of Competitive Electricity Markets) simulator, endowing a more effective tool for decision support. The achievements from the implementation of the proposed module enables researchers and electricity markets’ participating entities to analyze data, create real scenarios and make experiments with them. On the other hand, applying knowledge discovery techniques to real data also allows the improvement of MASCEM agents’ profiles and strategies resulting in a better representation of real market players’ behavior. This work aims to improve the comprehension of electricity markets and the interactions among the involved entities through adequate multi-agent simulation.
Resumo:
Trabalho de Projeto apresentado como requisito parcial para obtenção do grau de Mestre em Estatística e Gestão de Informação
Resumo:
A Internet das Coisas tal como o Big Data e a análise dos dados são dos temas mais discutidos ao querermos observar ou prever as tendências do mercado para as próximas décadas, como o volume económico, financeiro e social, pelo que será relevante perceber a importância destes temas na atualidade. Nesta dissertação será descrita a origem da Internet das Coisas, a sua definição (por vezes confundida com o termo Machine to Machine, redes interligadas de máquinas controladas e monitorizadas remotamente e que possibilitam a troca de dados (Bahga e Madisetti 2014)), o seu ecossistema que envolve a tecnologia, software, dispositivos, aplicações, a infra-estrutura envolvente, e ainda os aspetos relacionados com a segurança, privacidade e modelos de negócios da Internet das Coisas. Pretende-se igualmente explicar cada um dos “Vs” associados ao Big Data: Velocidade, Volume, Variedade e Veracidade, a importância da Business Inteligence e do Data Mining, destacando-se algumas técnicas utilizadas de modo a transformar o volume dos dados em conhecimento para as empresas. Um dos objetivos deste trabalho é a análise das áreas de IoT, modelos de negócio e as implicações do Big Data e da análise de dados como elementos chave para a dinamização do negócio de uma empresa nesta área. O mercado da Internet of Things tem vindo a ganhar dimensão, fruto da Internet e da tecnologia. Devido à importância destes dois recursos e á falta de estudos em Portugal neste campo, com esta dissertação, sustentada na metodologia do “Estudo do Caso”, pretende-se dar a conhecer a experiência portuguesa no mercado da Internet das Coisas. Visa-se assim perceber quais os mecanismos utilizados para trabalhar os dados, a metodologia, sua importância, que consequências trazem para o modelo de negócio e quais as decisões tomadas com base nesses mesmos dados. Este estudo tem ainda como objetivo incentivar empresas portuguesas que estejam neste mercado ou que nele pretendam aceder, a adoptarem estratégias, mecanismos e ferramentas concretas no que diz respeito ao Big Data e análise dos dados.
Resumo:
The interest in using information to improve the quality of living in large urban areas and its governance efficiency has been around for decades. Nevertheless, the improvements in Information and Communications Technology has sparked a new dynamic in academic research, usually under the umbrella term of Smart Cities. This concept of Smart City can probably be translated, in a simplified version, into cities that are lived, managed and developed in an information-saturated environment. While it makes perfect sense and we can easily foresee the benefits of such a concept, presently there are still several significant challenges that need to be tackled before we can materialize this vision. In this work we aim at providing a small contribution in this direction, which maximizes the relevancy of the available information resources. One of the most detailed and geographically relevant information resource available, for the study of cities, is the census, more specifically the data available at block level (Subsecção Estatística). In this work, we use Self-Organizing Maps (SOM) and the variant Geo-SOM to explore the block level data from the Portuguese census of Lisbon city, for the years of 2001 and 2011. We focus on gauging change, proposing ways that allow the comparison of the two time periods, which have two different underlying geographical bases. We proceed with the analysis of the data using different SOM variants, aiming at producing a two-fold portrait: one, of the evolution of Lisbon during the first decade of the XXI century, another, of how the census dataset and SOM’s can be used to produce an informational framework for the study of cities.
Resumo:
telligence applications for the banking industry. Searches were performed in relevant journals resulting in 219 articles published between 2002 and 2013. To analyze such a large number of manuscripts, text mining techniques were used in pursuit for relevant terms on both business intelligence and banking domains. Moreover, the latent Dirichlet allocation modeling was used in or- der to group articles in several relevant topics. The analysis was conducted using a dictionary of terms belonging to both banking and business intelli- gence domains. Such procedure allowed for the identification of relationships between terms and topics grouping articles, enabling to emerge hypotheses regarding research directions. To confirm such hypotheses, relevant articles were collected and scrutinized, allowing to validate the text mining proce- dure. The results show that credit in banking is clearly the main application trend, particularly predicting risk and thus supporting credit approval or de- nial. There is also a relevant interest in bankruptcy and fraud prediction. Customer retention seems to be associated, although weakly, with targeting, justifying bank offers to reduce churn. In addition, a large number of ar- ticles focused more on business intelligence techniques and its applications, using the banking industry just for evaluation, thus, not clearly acclaiming for benefits in the banking business. By identifying these current research topics, this study also highlights opportunities for future research.
Resumo:
Dissertação de mestrado integrado em Engenharia e Gestão de Sistemas de Informação
Resumo:
Similarity-based operations, similarity join, similarity grouping, data integration
Resumo:
Data mining, frequent pattern mining, database mining, mining algorithms in SQL
Resumo:
Visual data mining, multi-dimensional scaling, POLARMAP, Sammon's mapping, clustering, outlier detection
Resumo:
Magdeburg, Univ., Fak. für Informatik, Diss., 2013
Resumo:
Microarray transcript profiling and RNA interference are two new technologies crucial for large-scale gene function studies in multicellular eukaryotes. Both rely on sequence-specific hybridization between complementary nucleic acid strands, inciting us to create a collection of gene-specific sequence tags (GSTs) representing at least 21,500 Arabidopsis genes and which are compatible with both approaches. The GSTs were carefully selected to ensure that each of them shared no significant similarity with any other region in the Arabidopsis genome. They were synthesized by PCR amplification from genomic DNA. Spotted microarrays fabricated from the GSTs show good dynamic range, specificity, and sensitivity in transcript profiling experiments. The GSTs have also been transferred to bacterial plasmid vectors via recombinational cloning protocols. These cloned GSTs constitute the ideal starting point for a variety of functional approaches, including reverse genetics. We have subcloned GSTs on a large scale into vectors designed for gene silencing in plant cells. We show that in planta expression of GST hairpin RNA results in the expected phenotypes in silenced Arabidopsis lines. These versatile GST resources provide novel and powerful tools for functional genomics.
Resumo:
Cytotoxic T cells (CTL) recognize short peptides that are derived from the proteolysis of endogenous cellular proteins and presented on the cell surface as a complex with MHC class I molecules. CTL can recognize single amino acid substitutions in proteins, including those involved in malignant transformation. The mutated sequence of an oncogene may be presented on the cell surface as a peptide, and thus represents a potential target antigen for tumour therapy. The p21ras gene is mutated in a wide variety of tumours and since the transforming mutations result in amino acid substitutions at positions 12, 13 and 61 of the protein, a limited number of ras peptides could potentially be used in the treatment of a wide variety of malignancies. A common substitution is Val for Gly at position 12 of p21ras. In this study, we show that the peptide sequence from position 5 to position 14 with Val at position 12-ras p5-14 (Val-12)-has a motif which allows it to bind to HLA-A2.1. HLA-A2.1-restricted ras p5-14 (Val-12)-specific CTL were induced in mice transgenic for both HLA-A2.1 and human beta2-microglobulin after in vivo priming with the peptide. The murine CTL could recognize the ras p5-14 (Val-12) peptide when they were presented on both murine and human target cells bearing HLA-A2.1. No cross-reactivity was observed with the native peptide ras p5-14 (Gly-12), and this peptide was not immunogenic in HLA-A2.1 transgenic mice. This represents an interesting model for the study of an HLA-restricted CD8 cytotoxic T cell response to a defined tumour antigen in vivo.
Resumo:
Plant-parasitic nematodes are major agricultural pests worldwide and novel approaches to control them are sorely needed. We report the draft genome sequence of the root-knot nematode Meloidogyne incognita, a biotrophic parasite of many crops, including tomato, cotton and coffee. Most of the assembled sequence of this asexually reproducing nematode, totaling 86 Mb, exists in pairs of homologous but divergent segments. This suggests that ancient allelic regions in M. incognita are evolving toward effective haploidy, permitting new mechanisms of adaptation. The number and diversity of plant cell wall-degrading enzymes in M. incognita is unprecedented in any animal for which a genome sequence is available, and may derive from multiple horizontal gene transfers from bacterial sources. Our results provide insights into the adaptations required by metazoans to successfully parasitize immunocompetent plants, and open the way for discovering new antiparasitic strategies.
Resumo:
BACKGROUND: The availability of the P. falciparum genome has led to novel ways to identify potential vaccine candidates. A new approach for antigen discovery based on the bioinformatic selection of heptad repeat motifs corresponding to alpha-helical coiled coil structures yielded promising results. To elucidate the question about the relationship between the coiled coil motifs and their sequence conservation, we have assessed the extent of polymorphism in putative alpha-helical coiled coil domains in culture strains, in natural populations and in the single nucleotide polymorphism data available at PlasmoDB. METHODOLOGY/PRINCIPAL FINDINGS: 14 alpha-helical coiled coil domains were selected based on preclinical experimental evaluation. They were tested by PCR amplification and sequencing of different P. falciparum culture strains and field isolates. We found that only 3 out of 14 alpha-helical coiled coils showed point mutations and/or length polymorphisms. Based on promising immunological results 5 of these peptides were selected for further analysis. Direct sequencing of field samples from Papua New Guinea and Tanzania showed that 3 out of these 5 peptides were completely conserved. An in silico analysis of polymorphism was performed for all 166 putative alpha-helical coiled coil domains originally identified in the P. falciparum genome. We found that 82% (137/166) of these peptides were conserved, and for one peptide only the detected SNPs decreased substantially the probability score for alpha-helical coiled coil formation. More SNPs were found in arrays of almost perfect tandem repeats. In summary, the coiled coil structure prediction was rarely modified by SNPs. The analysis revealed a number of peptides with strictly conserved alpha-helical coiled coil motifs. CONCLUSION/SIGNIFICANCE: We conclude that the selection of alpha-helical coiled coil structural motifs is a valuable approach to identify potential vaccine targets showing a high degree of conservation.