231 results for Complete K-ary Tree
in Queensland University of Technology - ePrints Archive
Abstract:
A novel m-ary tree based approach is presented for solving asset management decision problems that are combinatorial in nature. The approach introduces a new dynamic constraint-based control mechanism that is capable of excluding infeasible solutions from the solution space. The approach also addresses the challenges of ordering asset decisions.
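The idea of excluding infeasible solutions while walking an m-ary decision tree can be illustrated with a minimal sketch. It is not the paper's mechanism: the budget constraint, the cost table and the function name `enumerate_feasible` are all hypothetical, and the pruning is only valid because costs are assumed to be non-negative.

```python
from typing import List, Sequence, Tuple


def enumerate_feasible(costs: Sequence[Sequence[float]],
                       budget: float) -> List[Tuple[int, ...]]:
    """Depth-first walk of an m-ary decision tree with constraint pruning.

    Level i of the tree stands for asset i, and each branch at that level
    is one candidate action. costs[i][a] is the (hypothetical, non-negative)
    cost of taking action a on asset i.
    """
    feasible: List[Tuple[int, ...]] = []

    def descend(level: int, chosen: Tuple[int, ...], spent: float) -> None:
        if level == len(costs):
            # A leaf: every asset has an action and the budget still holds.
            feasible.append(chosen)
            return
        for action, cost in enumerate(costs[level]):
            # Constraint check: an over-budget branch is skipped together
            # with its whole subtree, so it never enters the solution space.
            if spent + cost > budget:
                continue
            descend(level + 1, chosen + (action,), spent + cost)

    descend(0, (), 0.0)
    return feasible


if __name__ == "__main__":
    # Two assets, two actions each (0 = do nothing, 1 = intervene).
    print(enumerate_feasible([[0, 2], [0, 3]], budget=3))
```

With two assets and a budget of 3, the combination that intervenes on both assets (cost 5) is never generated, while the other three combinations are returned.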
Abstract:
We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm. Unlike k-means it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare performance and quality to CLUTO using document collections. The K-tree has a low time complexity that is suitable for large document collections. This tree structure allows for efficient disk based implementations where space requirements exceed that of main memory.
Abstract:
This paper describes the approach taken to the XML Mining track at INEX 2008 by a group at the Queensland University of Technology. We introduce the K-tree clustering algorithm in an Information Retrieval context by adapting it for document clustering. Many large scale problems exist in document clustering. K-tree scales well with large inputs due to its low complexity. It offers promising results both in terms of efficiency and quality. Document classification was completed using Support Vector Machines.
Abstract:
Random Indexing K-tree is the combination of two algorithms suited for large scale document clustering.
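As a rough illustration of the Random Indexing half of that combination, the sketch below assigns each term a sparse random vector of +1/-1 entries and sums those vectors to obtain a fixed-size document vector. The dimension, the number of non-zero entries and the function names are assumptions chosen for illustration, not values taken from the paper.

```python
import random
from typing import Iterable, List


def index_vector(term: str, dim: int = 64, nonzero: int = 4) -> List[int]:
    """Sparse ternary index vector for a term (assumed parameters)."""
    # Seeding with the term makes the vector reproducible for that term.
    rng = random.Random(term)
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1, 1))
    return vec


def document_vector(tokens: Iterable[str], dim: int = 64,
                    nonzero: int = 4) -> List[int]:
    """Sum the index vectors of all tokens in a document."""
    doc = [0] * dim
    for token in tokens:
        for i, value in enumerate(index_vector(token, dim, nonzero)):
            doc[i] += value
    return doc


if __name__ == "__main__":
    vec = document_vector("the quick brown fox".split())
    print(len(vec), sum(1 for v in vec if v != 0))
```

The resulting vectors have a fixed, small dimension regardless of vocabulary size, which is what makes them convenient inputs for a clustering structure such as the K-tree.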
Abstract:
Perez-Losada et al. [1] analyzed 72 complete genomes corresponding to nine mammalian (67 strains) and two avian (5 strains) polyomavirus species using maximum likelihood and Bayesian methods of phylogenetic inference. Because data for two of the genomes in their work are no longer available in GenBank, in this work we analyze the phylogenetic relationships of the remaining 70 complete genomes, corresponding to nine mammalian (65 strains) and two avian (5 strains) polyomavirus species, using a dynamical language model approach developed by our group (Yu et al. [26]). This distance method does not require sequence alignment for deriving species phylogeny based on overall similarities of the complete genomes. Our best tree separates the bird polyomaviruses (avian polyomaviruses and goose hemorrhagic polyomaviruses) from the mammalian polyomaviruses, which supports the idea of splitting the genus into two subgenera. Such a split is consistent with the different viral life strategies of each group. In the mammalian polyomavirus subgenus, mouse polyomaviruses (MPV), simian viruses 40 (SV40), BK viruses (BKV) and JC viruses (JCV) are grouped as different branches, as expected. The topology of our best tree is quite similar to that of the tree constructed by Perez-Losada et al.
Abstract:
This paper describes the approach taken to the clustering task at INEX 2009 by a group at the Queensland University of Technology. The Random Indexing (RI) K-tree has been used with a representation that is based on the semantic markup available in the INEX 2009 Wikipedia collection. The RI K-tree is a scalable approach to clustering large document collections. This approach has produced quality clustering when evaluated using two different methodologies.
Abstract:
Digital collections are growing exponentially in size as the information age takes a firm grip on all aspects of society. As a result, Information Retrieval (IR) has become an increasingly important area of research. It promises to provide new and more effective ways for users to find information relevant to their search intentions. Document clustering is one of the many tools in the IR toolbox and is far from being perfected. It groups documents that share common features. This grouping allows a user to quickly identify relevant information. If these groups are misleading then valuable information can accidentally be ignored. Therefore, the study and analysis of the quality of document clustering is important. With more and more digital information available, the performance of these algorithms is also of interest. An algorithm with a time complexity of O(n^2) can quickly become impractical when clustering a corpus containing millions of documents. Therefore, the investigation of algorithms and data structures to perform clustering in an efficient manner is vital to its success as an IR tool. Document classification is another tool frequently used in the IR field. It predicts categories of new documents based on an existing database of (document, category) pairs. Support Vector Machines (SVM) have been found to be effective when classifying text documents. As the algorithms for classification are both efficient and of high quality, the largest gains can be made from improvements to representation. Document representations are vital for both clustering and classification. Representations exploit the content and structure of documents. Dimensionality reduction can improve the effectiveness of existing representations in terms of quality and run-time performance. Research into these areas is another way to improve the efficiency and quality of clustering and classification results. Evaluating document clustering is a difficult task.
Intrinsic measures of quality such as distortion only indicate how well an algorithm minimised a similarity function in a particular vector space. Intrinsic comparisons are inherently limited by the given representation and are not comparable between different representations. Extrinsic measures of quality compare a clustering solution to a “ground truth” solution. This allows comparison between different approaches. As the “ground truth” is created by humans it can suffer from the fact that not every human interprets a topic in the same manner. Whether a document belongs to a particular topic or not can be subjective.
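The difference between the two kinds of measure can be made concrete with a small sketch. Distortion is taken here as the sum of squared Euclidean distances from each point to its nearest centroid, and purity is used as one common example of an extrinsic measure; the abstract does not say which exact definitions or extrinsic measures the thesis uses, so both choices are assumptions.

```python
from collections import Counter, defaultdict
from typing import Hashable, Sequence


def distortion(points: Sequence[Sequence[float]],
               centroids: Sequence[Sequence[float]]) -> float:
    """Intrinsic: squared distance of every point to its nearest centroid."""
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c))
                     for c in centroids)
    return total


def purity(cluster_ids: Sequence[Hashable],
           true_labels: Sequence[Hashable]) -> float:
    """Extrinsic: fraction of documents in their cluster's majority class."""
    by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        by_cluster[cluster][label] += 1
    majority = sum(max(counts.values()) for counts in by_cluster.values())
    return majority / len(true_labels)


if __name__ == "__main__":
    print(distortion([(0.0, 0.0), (2.0, 0.0)], [(1.0, 0.0)]))
    print(purity([0, 0, 1, 1], ["a", "a", "a", "b"]))
```

Distortion depends on the vector space in which the points live, so its values cannot be compared across representations, whereas purity only needs cluster assignments and human-provided labels.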
Abstract:
In this paper we present pyktree, an implementation of the K-tree algorithm in the Python programming language. The K-tree algorithm provides highly balanced search trees for vector quantization that scale up to very large data sets. Pyktree is highly modular and well suited for rapid prototyping of novel distance measures and centroid representations. It is easy to install and provides a Python package for library use as well as command line tools.
Abstract:
High reliability of railway power systems is one of the essential criteria to ensure quality and cost-effectiveness of railway services. Evaluation of reliability at system level is essential not only for scheduling maintenance activities, but also for identifying reliability-critical components. Various methods to compute the reliability of individual components or regularly structured systems have been developed and proven to be effective. However, they are not adequate for evaluating complicated systems with numerous interconnected components, such as railway power systems, or for locating the reliability-critical components. Fault tree analysis (FTA) integrates the reliability of individual components into the overall system reliability through quantitative evaluation and identifies the critical components by minimum cut sets and sensitivity analysis. The paper presents the reliability evaluation of railway power systems by FTA and investigates the impact of maintenance activities on overall reliability. The applicability of the proposed methods is illustrated by case studies in AC railways.
Abstract:
Fault tree analysis (FTA) is presented to model the reliability of a railway traction power system in this paper. First, the construction of a fault tree is introduced to integrate the components of a traction power system into a single tree; then the binary decision diagram (BDD) method is used to evaluate fault trees qualitatively and quantitatively. The components contributing to the reliability of the overall system are identified with their relative importance through sensitivity analysis. Finally, an AC traction power system is evaluated by the proposed methods.
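The quantitative side of fault tree evaluation can be sketched in a few lines. This is not the BDD method used in the paper: it applies the textbook AND/OR gate formulas directly, which is only correct when basic events are statistically independent and no event appears more than once in the tree. The component names and failure probabilities are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Basic:
    """A basic event with its failure probability."""
    name: str
    p: float


@dataclass
class Gate:
    """An AND or OR gate over basic events or other gates."""
    kind: str  # "AND" or "OR"
    children: List[Union["Gate", Basic]]


def failure_probability(node: Union[Gate, Basic]) -> float:
    """Probability of the event at this node, assuming independence."""
    if isinstance(node, Basic):
        return node.p
    probs = [failure_probability(child) for child in node.children]
    if node.kind == "AND":
        # All inputs must fail: multiply the failure probabilities.
        result = 1.0
        for p in probs:
            result *= p
        return result
    if node.kind == "OR":
        # Any input failing is enough: one minus the product of survivals.
        survive = 1.0
        for p in probs:
            survive *= 1.0 - p
        return 1.0 - survive
    raise ValueError(f"unknown gate kind: {node.kind}")


if __name__ == "__main__":
    # Hypothetical system: supply is lost if both transformers fail,
    # or if the single feeder fails.
    top = Gate("OR", [
        Gate("AND", [Basic("transformer_a", 0.1), Basic("transformer_b", 0.2)]),
        Basic("feeder", 0.05),
    ])
    print(failure_probability(top))
```

Repeated events are precisely the case where this direct calculation breaks down, which is one reason a BDD-based evaluation is attractive for real systems.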
Abstract:
The monogeneric family Fergusoninidae consists of gall-forming flies that, together with Fergusobia (Tylenchida: Neotylenchidae) nematodes, form the only known mutualistic association between insects and nematodes. In this study, the entire 16,000 bp mitochondrial genome of Fergusonina taylori Nelson and Yeates was sequenced. The circular genome contains one encoding region including 27 genes and one non-coding A+T-rich region. The arrangement of the protein-coding, ribosomal RNA (rRNA) and transfer RNA (tRNA) genes was the same as that found in the ancestral insect. Nucleotide composition is highly A+T biased. All of the protein initiation codons are ATN, except for nad1 which begins with TTT. All 22 tRNA anticodons of F. taylori match those observed in Drosophila yakuba, and all form the typical cloverleaf structure except for tRNA-Ser (AGN) which lacks a dihydrouridine (DHU) arm. Secondary structural features of the rRNA genes of Fergusonina are similar to those proposed for other insects, with minor modifications. The mitochondrial genome of Fergusonina presented here may prove valuable for resolving the sister group to the Fergusoninidae, and expands the available mtDNA data sources for acalyptrates overall.
Abstract:
This article focuses on problem solving activities in a first grade classroom in a typical small community and school in Indiana. However, the teacher and the activities in this class were not at all typical of what goes on in most comparable classrooms, and the issues that will be addressed are relevant and important for students from kindergarten through college. Can children really solve problems that involve concepts (or skills) that they have not yet been taught? Can children really create important mathematical concepts on their own – without a lot of guidance from teachers? What is the relationship between problem solving abilities and the mastery of skills that are widely regarded as being “prerequisites” to such tasks? Can primary school children (whose toolkits of skills are limited) engage productively in authentic simulations of “real life” problem solving situations? Can three-person teams of primary school children really work together collaboratively, and remain intensely engaged, on problem solving activities that require more than an hour to complete? Are the kinds of learning and problem solving experiences that are recommended (for example) in the USA’s Common Core State Curriculum Standards really representative of the kind that even young children encounter beyond school in the 21st century? … This article offers an existence proof showing why our answers to these questions are: Yes. Yes. Yes. Yes. Yes. Yes. And: No. … Even though the evidence we present is only intended to demonstrate what’s possible, not what’s likely to occur under any circumstances, there is no reason to expect that the things that our children accomplished could not be accomplished by average ability children in other schools and classrooms.