856 results for sequence data mining


Relevance:

100.00%

Publisher:

Abstract:

Most research on technology roadmapping has focused on its practical applications and on developing methods to enhance its operational process. Thus, despite a demand for well-supported, systematic information, little attention has been paid to which information can be utilised in technology roadmapping, and how. This paper therefore proposes a methodology for structuring technological information to facilitate the process. To this end, eight methods are suggested to provide useful information for technology roadmapping: summary, information extraction, clustering, mapping, navigation, linking, indicators and comparison. This research identifies the characteristics of significant data that can potentially be used in roadmapping, and presents an approach to extracting important information from such raw data through various data mining techniques, including text mining, multi-dimensional scaling and K-means clustering. In addition, this paper explains how this approach can be applied in each step of roadmapping. The proposed approach is applied to develop a roadmap of radio-frequency identification (RFID) technology to illustrate the process in practice. © 2013 Taylor & Francis.
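
As a rough illustration of the K-means clustering step named in the abstract above, the following is a minimal sketch of Lloyd's algorithm in plain Python. It is not the paper's implementation: the function name `kmeans`, the deterministic initialisation from the first k points, and the toy two-dimensional vectors (standing in for, say, keyword-frequency features of documents) are all assumptions made for the example.

```python
import math

def kmeans(points, k, iterations=20):
    """Cluster points (tuples of floats) into k groups with Lloyd's algorithm."""
    # Assumption: start from the first k points so the run is reproducible.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Hypothetical feature vectors for four documents.
docs = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centroids = kmeans(docs, k=2)
```

In practice the initial centroids are usually chosen at random or with a seeding heuristic, and the result can depend on that choice; the fixed start here only keeps the sketch deterministic.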

Relevance:

100.00%

Publisher:

Abstract:

To clarify cuttlefish phylogeny, the mitochondrial cytochrome c oxidase subunit 1 (COI) gene and a partial 16S rRNA gene were sequenced for 13 cephalopod species. Phylogenetic trees were constructed with the neighbor-joining method. Coleoids are divided into two main lineages, Decabrachia and Octobrachia. The monophyly of the order Sepioidea, which includes the families Sepiidae, Sepiolidae and Idiosepiidae, is not supported. Of the two families of Sepioidea examined, the Sepiolidae are polyphyletic and are excluded from the order. On the basis of 16S rRNA and COI amino acid sequence data, the two genera of Sepiidae (Sepiella and Sepia) can be distinguished, but they show no visible boundary when COI gene sequences are used. The reason is explained. This suggests that the 16S rDNA of cephalopods is a valuable tool for analyzing taxonomic relationships at the genus level, whereas the COI gene is better suited to a higher taxonomic level (i.e., family).
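
The neighbor-joining method mentioned in the abstract above can be sketched as follows. This is a simplified illustration that recovers only the tree topology (no branch lengths). The function name `neighbor_joining`, the `frozenset`-keyed distance table, and the five-taxon distance matrix (a common textbook example, not data from the paper) are assumptions.

```python
from itertools import combinations

def neighbor_joining(names, dist):
    """Return an unrooted topology as nested tuples.

    dist maps frozenset({x, y}) to the distance between taxa x and y.
    """
    nodes = list(names)
    d = dict(dist)
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence: sum of distances from each node to all others.
        r = {x: sum(d[frozenset((x, y))] for y in nodes if y != x)
             for x in nodes}
        # Join the pair minimising Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j).
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]])
        new = (i, j)  # the new internal node is named by its two children
        for k in nodes:
            if k not in (i, j):
                # Distance from the new node to every remaining node.
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))]
                    - d[frozenset((i, j))])
        nodes = [k for k in nodes if k not in (i, j)] + [new]
    return tuple(nodes)

# Illustrative pairwise distances between five hypothetical taxa.
taxa = ["a", "b", "c", "d", "e"]
rows = {"a": [0, 5, 9, 9, 8], "b": [5, 0, 10, 10, 9], "c": [9, 10, 0, 8, 7],
        "d": [9, 10, 8, 0, 3], "e": [8, 9, 7, 3, 0]}
dist = {frozenset((x, y)): rows[x][taxa.index(y)]
        for x, y in combinations(taxa, 2)}
tree = neighbor_joining(taxa, dist)
```

With real sequence data the distances would first be estimated from the aligned COI or 16S sequences; when several pairs tie on Q, the choice among them does not change the resulting unrooted tree for additive distances such as these.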

Relevance:

100.00%

Publisher:

Abstract:

Ulvacean green seaweeds are common worldwide; in recent years they have formed massive green tides in the Yellow Sea, causing marine ecological problems as well as social issues. We investigated two major genera of the Ulvaceae, Ulva and Enteromorpha, collecting and analyzing plastid rbcL and nuclear ITS sequences from specimens of both genera on the two sides of the Yellow Sea. Phylogenetic trees of rbcL data show the occurrence of five species of Enteromorpha (E. compressa, E. flexuosa, E. intestinalis, E. linza and E. prolifera) and three species of Ulva (U. pertusa, U. rigida and U. ohnoi). Notably, we found U. ohnoi, which is known as a subtropical to tropical species, at two sites on Jeju Island, Korea. Four ribotypes in partial sequences of 5.8S rDNA and ITS2 from E. compressa were also found. Ribotype network analysis revealed that the common ribotype, occurring in China, Korea and Europe, is connected with ribotypes from Europe and China/Japan. Although samples of the same species were collected from both sides of the Yellow Sea, intraspecific genetic polymorphism of each species was low among samples collected worldwide.

Relevance:

100.00%

Publisher:

Abstract:

Clare, A. and King R.D. (2003) Data mining the yeast genome in a lazy functional language. In Practical Aspects of Declarative Languages (PADL'03) (won Best/Most Practical Paper award).

Relevance:

100.00%

Publisher:

Abstract:

In the last decade, data mining has emerged as one of the most dynamic and lively areas in information technology. Although many algorithms and techniques for data mining have been proposed, they focus either on domain-independent techniques or on very specific domain problems. A general requirement in bridging the gap between academia and business is to cater to general domain-related issues surrounding real-life applications, such as constraints, organizational factors, domain expert knowledge, domain adaptation, and operational knowledge. Unfortunately, these either have not been addressed, or have not been sufficiently addressed, in current data mining research and development. Domain-Driven Data Mining (D3M) aims to develop general principles, methodologies, and techniques for modeling and merging comprehensive domain-related factors and synthesized ubiquitous intelligence surrounding problem domains with the data mining process, and discovering knowledge to support business decision-making. This paper aims to report original, cutting-edge, and state-of-the-art progress in D3M. It covers theoretical and applied contributions aiming to: 1) propose next-generation data mining frameworks and processes for actionable knowledge discovery, 2) investigate effective (automated, human and machine-centered and/or human-machine-cooperated) principles and approaches for acquiring, representing, modelling, and engaging ubiquitous intelligence in real-world data mining, and 3) develop workable and operational systems balancing technical significance and application concerns, and converting and delivering actionable knowledge into operational application rules to seamlessly engage application processes and systems.

Relevance:

100.00%

Publisher:

Abstract:

Background. The assembly of the tree of life has seen significant progress in recent years, but algae and protists have been largely overlooked in this effort. Many groups of algae and protists have ancient roots, and it is unclear how much data will be required to resolve their phylogenetic relationships for incorporation in the tree of life. The red algae, a group of primary photosynthetic eukaryotes more than a billion years old, provide the earliest fossil evidence for eukaryotic multicellularity and sexual reproduction. Despite this evolutionary significance, their phylogenetic relationships are understudied. This study aims to infer a comprehensive red algal tree of life at the family level from a supermatrix containing data mined from GenBank. We aim to locate remaining regions of low support in the topology, evaluate their causes and estimate the amount of data required to resolve them. Results. Phylogenetic analysis of a supermatrix of 14 loci and 98 red algal families yielded the most complete red algal tree of life to date. Visualization of statistical support showed the presence of five poorly supported regions. Causes for low support were identified with statistics about the age of the region, data availability and node density, showing that poor support has different origins in different parts of the tree. Parametric simulation experiments yielded optimistic estimates of how much data will be needed to resolve the poorly supported regions (ca. 10^3 to ca. 10^4 nucleotides for the different regions). Nonparametric simulations gave a markedly more pessimistic picture, with some regions requiring more than 2.8 × 10^5 nucleotides or not achieving the desired level of support at all. The discrepancies between parametric and nonparametric simulations are discussed in light of our dataset and known attributes of both approaches. Conclusions. Our study takes the red algae one step closer to meaningful inclusion in the tree of life. In addition to the recovery of stable relationships, the recognition of five regions in need of further study is a significant outcome of this work. Based on our analyses of current availability and future requirements of data, we make clear recommendations for forthcoming research.
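
To make the supermatrix idea in the abstract above concrete, here is a minimal sketch of concatenating per-locus alignments into one matrix, padding taxa that lack a locus with a missing-data symbol. The function name `build_supermatrix`, the "?" padding character, and the toy sequences are assumptions for illustration and do not come from the study.

```python
def build_supermatrix(loci, missing="?"):
    """Concatenate aligned loci into one sequence per taxon.

    loci is a list of dicts mapping taxon name -> aligned sequence;
    all sequences within one locus must have the same length.
    """
    taxa = sorted({taxon for locus in loci for taxon in locus})
    matrix = {taxon: "" for taxon in taxa}
    for locus in loci:
        lengths = {len(seq) for seq in locus.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a locus must be aligned "
                             "to equal length")
        width = lengths.pop()
        for taxon in taxa:
            # Taxa without data for this locus get a run of missing symbols.
            matrix[taxon] += locus.get(taxon, missing * width)
    return matrix

# Two hypothetical loci sampled for overlapping sets of taxa.
loci = [
    {"TaxonA": "ACGT", "TaxonB": "ACGA"},
    {"TaxonA": "GG", "TaxonC": "GT"},
]
supermatrix = build_supermatrix(loci)
```

The share of missing symbols per taxon or per region gives one simple measure of data availability of the kind the abstract relates to low support.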