956 results for Cluster Counting Algorithm


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Motivation: A consensus sequence for a family of related sequences is, as the name suggests, a sequence that captures the features common to most members of the family. Consensus sequences are important in various DNA sequencing applications and are a convenient way to characterize a family of molecules. Results: This paper describes a new algorithm for finding a consensus sequence, using the popular optimization method known as simulated annealing. Unlike the conventional approach of finding a consensus sequence by first forming a multiple sequence alignment, this algorithm searches for a sequence that minimises the sum of pairwise distances to each of the input sequences. The resulting consensus sequence can then be used to induce a multiple sequence alignment. The time required by the algorithm scales linearly with the number of input sequences and quadratically with the length of the consensus sequence. We present results demonstrating the high quality of the consensus sequences and alignments produced by the new algorithm. For comparison, we also present similar results obtained using ClustalW. The new algorithm outperforms ClustalW in many cases.
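The search described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes plain edit (Levenshtein) distance as the pairwise distance, single-character substitution, insertion, and deletion as the move set, and a geometric cooling schedule. The function names and the parameters `steps`, `t_start`, and `t_end` are hypothetical.

```python
import math
import random


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row in O(len(a) * len(b)) time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]


def total_distance(candidate, sequences):
    """Objective: sum of distances from the candidate to every input sequence."""
    return sum(edit_distance(candidate, s) for s in sequences)


def propose(candidate, alphabet, rng):
    """Random single-character substitution, deletion, or insertion."""
    move = rng.choice(("sub", "ins", "del"))
    if move == "sub" and candidate:
        i = rng.randrange(len(candidate))
        return candidate[:i] + rng.choice(alphabet) + candidate[i + 1:]
    if move == "del" and len(candidate) > 1:
        i = rng.randrange(len(candidate))
        return candidate[:i] + candidate[i + 1:]
    # Fall back to an insertion when the chosen move is not applicable.
    i = rng.randrange(len(candidate) + 1)
    return candidate[:i] + rng.choice(alphabet) + candidate[i:]


def anneal_consensus(sequences, alphabet="ACGT", steps=2000,
                     t_start=2.0, t_end=0.01, seed=0):
    """Simulated annealing over candidate consensus strings (assumed schedule)."""
    rng = random.Random(seed)
    current = sequences[0]  # start from one of the inputs
    current_cost = total_distance(current, sequences)
    best, best_cost = current, current_cost
    for k in range(steps):
        # Geometric cooling from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (k / max(steps - 1, 1))
        candidate = propose(current, alphabet, rng)
        cost = total_distance(candidate, sequences)
        delta = cost - current_cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = candidate, cost
    return best, best_cost
```

Each evaluation of the objective costs one distance computation per input sequence, and each distance computation is quadratic in the sequence length, which is consistent with the scaling stated in the abstract.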

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A new algorithm has been developed for smoothing the surfaces in finite element formulations of contact-impact. A key feature of this method is that the smoothing is done implicitly by constructing smooth signed distance functions for the bodies. These functions are then employed for the computation of the gap and other variables needed for implementation of contact-impact. The smoothed signed distance functions are constructed by a moving least-squares approximation with a polynomial basis. Results show that when nodes are placed on a surface, the surface can be reproduced with an error of about one per cent or less with either a quadratic or a linear basis. With a quadratic basis, the method exactly reproduces a circle or a sphere even for coarse meshes. Results are presented for contact problems involving the contact of circular bodies. Copyright (C) 2002 John Wiley & Sons, Ltd.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Libraries of cyclic peptides are being synthesized using combinatorial chemistry for high throughput screening in the drug discovery process. This paper describes the min_syn_steps.cpp program (available at http://www.imb.uq.edu.au/groups/smythe/tran), which takes a list of cyclic peptides to be synthesized, removes cyclically redundant sequences, and calculates synthetic strategies that minimize the synthetic steps as well as the reagent requirements. The synthetic steps and reagent requirements can be minimized by finding common subsets within the sequences for block synthesis. Since the search space of a brute-force approach to finding optimum synthetic strategies is impractically large, a subset-orientated approach is utilized here to limit the size of the search. (C) 2002 Elsevier Science Ltd. All rights reserved.
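The redundancy-removal step mentioned in this abstract can be illustrated with a short sketch. It is not taken from min_syn_steps.cpp: it assumes each peptide is written as a string of one-letter residue codes and treats two strings as the same cyclic peptide when one is a rotation of the other (reversed reading direction is deliberately not treated as equivalent). The function names are hypothetical.

```python
def canonical_rotation(seq: str) -> str:
    """Return the lexicographically smallest rotation of a cyclic sequence."""
    if not seq:
        return seq
    doubled = seq + seq  # every rotation is a window of length len(seq)
    return min(doubled[i:i + len(seq)] for i in range(len(seq)))


def remove_cyclic_duplicates(peptides):
    """Keep the first occurrence of each cyclic peptide, ignoring rotation."""
    seen = set()
    unique = []
    for peptide in peptides:
        key = canonical_rotation(peptide)
        if key not in seen:
            seen.add(key)
            unique.append(peptide)
    return unique
```

For example, `remove_cyclic_duplicates(["ACDE", "CDEA", "ADCE", "EACD"])` keeps `"ACDE"` and `"ADCE"`, because `"CDEA"` and `"EACD"` are rotations of `"ACDE"`.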

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In microarray studies, clustering techniques are often applied to derive meaningful insights into the data. In the past, hierarchical methods have been the primary clustering tool employed to perform this task. The hierarchical algorithms have been mainly applied heuristically to these cluster analysis problems. Further, a major limitation of these methods is their inability to determine the number of clusters. Thus there is a need for a model-based approach to these clustering problems. To this end, McLachlan et al. [7] developed a mixture model-based algorithm (EMMIX-GENE) for the clustering of tissue samples. To further investigate the EMMIX-GENE procedure as a model-based approach, we present a case study involving the application of EMMIX-GENE to the breast cancer data as studied recently in van 't Veer et al. [10]. Our analysis considers the problem of clustering the tissue samples on the basis of the genes, which is a non-standard problem because the number of genes greatly exceeds the number of tissue samples. We demonstrate how EMMIX-GENE can be useful in reducing the initial set of genes down to a more computationally manageable size. The results from this analysis also emphasise the difficulty associated with the task of separating two tissue groups on the basis of a particular subset of genes. These results also shed light on why supervised methods have such a high misallocation error rate for the breast cancer data.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The Lanczos algorithm is appreciated in many situations due to its speed and economy of storage. However, the advantage that the Lanczos basis vectors need not be kept is lost when the algorithm is used to compute the action of a matrix function on a vector. Either the basis vectors need to be kept, or the Lanczos process needs to be applied twice. In this study we describe an augmented Lanczos algorithm to compute a dot product relative to a function of a large sparse symmetric matrix, without keeping the basis vectors.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This article presents Monte Carlo techniques for estimating network reliability. For highly reliable networks, techniques based on graph evolution models provide very good performance. However, they are known to have significant simulation cost. An existing hybrid scheme (based on partitioning the time space) is available to speed up the simulations; however, there are difficulties with optimizing the important parameter associated with this scheme. To overcome these difficulties, a new hybrid scheme (based on partitioning the edge set) is proposed in this article. The proposed scheme shows orders of magnitude improvement of performance over the existing techniques in certain classes of network. It also provides reliability bounds with little overhead.
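For orientation, the simplest baseline for this estimation problem is crude Monte Carlo, sketched below. It is not the graph evolution or hybrid scheme discussed in the abstract. The sketch assumes an undirected network whose edges fail independently, each being up with the same probability `p_up`, and it estimates all-terminal reliability (the probability that the surviving edges connect every node). All names and parameters are hypothetical.

```python
import random


def is_connected(num_nodes, edges):
    """Union-find check that the given edges connect nodes 0..num_nodes-1."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = num_nodes
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components == 1


def crude_mc_reliability(num_nodes, edges, p_up, samples=10000, seed=0):
    """Fraction of sampled edge states in which the network stays connected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Each edge survives independently with probability p_up.
        surviving = [e for e in edges if rng.random() < p_up]
        hits += is_connected(num_nodes, surviving)
    return hits / samples
```

For a triangle with `p_up = 0.9` the exact value is p^3 + 3p^2(1 - p) = 0.972, which the estimate should approach. When failures are very rare, crude sampling needs a very large number of samples to observe any failure at all, which is the regime that more specialised techniques target.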

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A combined Genetic Algorithm and Method of Moments design method is presented for the design of unusual near-field antennas for use in Magnetic Resonance Imaging systems. The method is successfully applied to the design of an asymmetric coil structure for use at 190 MHz and demonstrates excellent radiofrequency field homogeneity.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The design of randomized controlled trials entails decisions that have economic as well as statistical implications. In particular, the choice of an individual or cluster randomization design may affect the cost of achieving the desired level of power, other things being equal. Furthermore, if cluster randomization is chosen, the researcher must decide how to balance the number of clusters, or sites, and the size of each site. This article investigates these interrelated statistical and economic issues. Its principal purpose is to elucidate the statistical and economic trade-offs to assist researchers in employing randomized controlled trials that have desired economic, as well as statistical, properties. (C) 2003 Elsevier Inc. All rights reserved.
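The trade-off between the number of clusters and the cluster size can be made concrete with a standard textbook derivation, given here as an illustration rather than as the article's own model. The symbols are assumptions: J clusters of equal size n, intraclass correlation \rho, total outcome variance \sigma^2, a cost c_1 per individual, a cost c_2 per cluster, and a fixed budget B.

```latex
% Variance of the overall mean under cluster sampling, and the design effect
\operatorname{Var}(\bar{Y}) = \frac{\sigma^2}{J}\left(\rho + \frac{1-\rho}{n}\right)
  = \frac{\sigma^2}{Jn}\,\bigl[\,1 + (n-1)\rho\,\bigr]

% Budget constraint
B = J\,(c_1 n + c_2) \quad\Longrightarrow\quad J = \frac{B}{c_1 n + c_2}

% Substituting J, the variance is proportional to
\left(\rho + \frac{1-\rho}{n}\right)(c_1 n + c_2)
  = \rho c_1 n + \rho c_2 + (1-\rho)c_1 + \frac{(1-\rho)c_2}{n}

% Setting the derivative with respect to n to zero
\rho c_1 - \frac{(1-\rho)c_2}{n^2} = 0
  \quad\Longrightarrow\quad
  n_{\mathrm{opt}} = \sqrt{\frac{c_2}{c_1}\cdot\frac{1-\rho}{\rho}}
```

Under these assumptions the variance-minimising cluster size does not depend on the budget: a larger budget buys more clusters, not larger ones. Clusters should be larger when each cluster is expensive relative to each individual, and smaller when the intraclass correlation is high.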