97 results for computer algorithm


Relevance:

20.00%

Abstract:

Motivation: This paper introduces the software EMMIX-GENE, which has been developed specifically for a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is to first select a subset of the genes relevant for the clustering of the tissue samples: mixtures of t distributions are fitted to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. Imposing a threshold on the likelihood ratio statistic, in conjunction with a threshold on the size of a cluster, allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so mixtures of factor analyzers are used to effectively reduce the dimension of the feature space of genes.

Results: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes can be selected that reveal interesting clusterings of the tissues that are consistent either with the external classification of the tissues or with background and biological knowledge of these sets.
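The two-threshold gene selection described in this abstract can be sketched as a simple filter. This is only an illustration under stated assumptions: the `GeneFit` record, the function name, and the threshold values are hypothetical, and the likelihood ratio statistics are assumed to have been computed already by fitting one- and two-component t mixtures to each gene. It is not the EMMIX-GENE implementation.

```python
from typing import List, NamedTuple


class GeneFit(NamedTuple):
    """Hypothetical summary of the one- versus two-component fit for one gene."""
    name: str
    lrt_statistic: float    # likelihood ratio statistic, one vs. two components
    min_cluster_size: int   # size of the smaller cluster in the two-component fit


def select_genes(fits: List[GeneFit],
                 lrt_threshold: float,
                 size_threshold: int) -> List[str]:
    """Keep genes that pass both thresholds, ranked by increasing statistic."""
    kept = [
        fit for fit in fits
        if fit.lrt_statistic > lrt_threshold
        and fit.min_cluster_size >= size_threshold
    ]
    kept.sort(key=lambda fit: fit.lrt_statistic)
    return [fit.name for fit in kept]


# Made-up numbers, for illustration only.
example_fits = [
    GeneFit("g1", 12.0, 10),
    GeneFit("g2", 3.0, 15),   # fails the likelihood ratio threshold
    GeneFit("g3", 20.0, 2),   # fails the cluster size threshold
    GeneFit("g4", 9.5, 8),
]
selected = select_genes(example_fits, lrt_threshold=8.0, size_threshold=8)
```

Requiring a minimum cluster size alongside the statistic guards against genes whose apparent two-component structure comes from a handful of outlying tissues.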

Relevance:

20.00%

Abstract:

Read-only-memory-based (ROM-based) quantum computation (QC) is an alternative to oracle-based QC. It has the advantages of being less magical, and being more suited to implementing space-efficient computation (i.e., computation using the minimum number of writable qubits). Here we consider a number of small (one- and two-qubit) quantum algorithms illustrating different aspects of ROM-based QC. They are: (a) a one-qubit algorithm to solve the Deutsch problem; (b) a one-qubit binary multiplication algorithm; (c) a two-qubit controlled binary multiplication algorithm; and (d) a two-qubit ROM-based version of the Deutsch-Jozsa algorithm. For each algorithm we present experimental verification using nuclear magnetic resonance ensemble QC. The average fidelities for the implementation were in the ranges 0.9-0.97 for the one-qubit algorithms, and 0.84-0.94 for the two-qubit algorithms. We conclude with a discussion of future prospects for ROM-based quantum computation. We propose a four-qubit algorithm, using Grover's iterate, for solving a miniature real-world problem relating to the lengths of paths in a network.

Relevance:

20.00%

Abstract:

We present a novel maximum-likelihood-based algorithm for estimating the distribution of alignment scores from the scores of unrelated sequences in a database search. Using a new method for measuring the accuracy of p-values, we show that our maximum-likelihood-based algorithm is more accurate than existing regression-based and lookup table methods. We explore a more sophisticated way of modeling and estimating the score distributions (using a two-component mixture model and expectation maximization), but conclude that this does not improve significantly over simply ignoring scores with small E-values during estimation. Finally, we measure the classification accuracy of p-values estimated in different ways and observe that inaccurate p-values can, somewhat paradoxically, lead to higher classification accuracy. We explain this paradox and argue that statistical accuracy, not classification accuracy, should be the primary criterion in comparisons of similarity search methods that return p-values that adjust for target sequence length.

Relevance:

20.00%

Abstract:

A new algorithm has been developed for smoothing the surfaces in finite element formulations of contact-impact. A key feature of this method is that the smoothing is done implicitly by constructing smooth signed distance functions for the bodies. These functions are then employed for the computation of the gap and other variables needed for implementation of contact-impact. The smoothed signed distance functions are constructed by a moving least-squares approximation with a polynomial basis. Results show that when nodes are placed on a surface, the surface can be reproduced with an error of about one per cent or less with either a quadratic or a linear basis. With a quadratic basis, the method exactly reproduces a circle or a sphere even for coarse meshes. Results are presented for contact problems involving the contact of circular bodies. Copyright (C) 2002 John Wiley & Sons, Ltd.
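To make the moving least-squares idea in this abstract concrete, the following is a minimal one-dimensional sketch with a linear polynomial basis. It is not the paper's signed-distance construction: the Gaussian weight function, the `radius` parameter, and the function name are assumptions chosen for illustration. It shows the general property that a moving least-squares fit reproduces exactly any function lying in the span of its basis.

```python
import math
from typing import Sequence


def mls_linear_1d(x: float,
                  nodes: Sequence[float],
                  values: Sequence[float],
                  radius: float) -> float:
    """Moving least-squares value at x using the shifted basis [1, (x_i - x)].

    A weighted linear fit a0 + a1 * (x_i - x) is solved around the evaluation
    point; the approximation at x is the constant coefficient a0.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, ui in zip(nodes, values):
        d = xi - x
        w = math.exp(-(d / radius) ** 2)  # assumed Gaussian weight
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * ui
        t1 += w * d * ui
    det = s0 * s2 - s1 * s1
    if det <= 0.0:
        raise ValueError("need at least two distinct nodes with nonzero weight")
    # Solve the 2x2 normal equations for a0 by Cramer's rule.
    return (s2 * t0 - s1 * t1) / det


nodes = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
linear_values = [2.0 * xi + 1.0 for xi in nodes]
approx = mls_linear_1d(2.3, nodes, linear_values, radius=1.5)
```

Because the sampled data here are linear and the basis is linear, `approx` agrees with `2.0 * 2.3 + 1.0` up to rounding error.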

Relevance:

20.00%

Abstract:

Recently, several groups have investigated quantum analogues of random walk algorithms, both on a line and on a circle. It has been found that the quantum versions have markedly different features to the classical versions. Namely, the variance on the line and the mixing time on the circle increase quadratically faster in the quantum versions as compared to the classical versions. Here, we propose a scheme to implement the quantum random walk on a line and on a circle in an ion trap quantum computer. With current ion trap technology, the number of steps that could be experimentally implemented will be relatively small. However, we show how the enhanced features of these walks could be observed experimentally. In the limit of strong decoherence, the quantum random walk tends to the classical random walk. By measuring the degree to which the walk remains "quantum," this algorithm could serve as an important benchmarking protocol for ion trap quantum computers.
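The difference in variance on the line can be seen in a small classical simulation of a discrete-time quantum walk. The sketch below assumes a Hadamard coin and a symmetric initial coin state, which are common textbook choices and are not specified in the abstract. It simulates the ideal, decoherence-free walk only and says nothing about the ion trap scheme itself.

```python
import math
from typing import Tuple


def hadamard_walk(steps: int) -> Tuple[float, float]:
    """Simulate a Hadamard walk on a line; return (total probability, variance)."""
    size = 2 * steps + 1          # positions -steps..steps, stored at index x + steps
    s = 1.0 / math.sqrt(2.0)
    left = [0j] * size            # amplitudes with the coin pointing left
    right = [0j] * size           # amplitudes with the coin pointing right
    left[steps] = s               # symmetric initial coin state (|L> + i|R>)/sqrt(2)
    right[steps] = 1j * s
    for _ in range(steps):
        new_left = [0j] * size
        new_right = [0j] * size
        for x in range(size):
            a, b = left[x], right[x]
            if a == 0 and b == 0:
                continue          # unoccupied site; also keeps indices in range
            # Hadamard coin flip, then a shift conditioned on the coin.
            new_left[x - 1] += s * (a + b)
            new_right[x + 1] += s * (a - b)
        left, right = new_left, new_right
    probs = [abs(l) ** 2 + abs(r) ** 2 for l, r in zip(left, right)]
    total = sum(probs)
    mean = sum((i - steps) * p for i, p in enumerate(probs))
    second = sum((i - steps) ** 2 * p for i, p in enumerate(probs))
    return total, second - mean * mean


n_steps = 50
total_probability, quantum_variance = hadamard_walk(n_steps)
classical_variance = float(n_steps)  # unbiased +/-1 random walk: variance = n
```

For an unbiased classical walk the variance after n steps is n, whereas the quantum variance grows in proportion to n squared, so the gap widens as `n_steps` increases.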

Relevance:

20.00%

Abstract:

This paper outlines research on the processes taking place within the coal mineral matter at high temperatures and the development of the relationship between ash fusion temperatures (AFT) and phase equilibria of the coal ash slags. A new thermodynamic database for the Al-Ca-Fe-O-Si system developed by the author was used in conjunction with the thermodynamic computer package F*A*C*T for these purposes. In addition, high temperature experimental studies were undertaken that involved heat treatment and quenching of the ash cones followed by analyses using different techniques. The study provided new information on the processes taking place during the AFT test and demonstrated the validity of the AFT predictions with F*A*C*T. Examples of practical applications of the AFT prediction method are given in the paper. The results of this study are important not only for the AFT predictions, but also in general for the application of phase equilibrium science to the characterisation of the coal mineral matter interactions at high temperature. (C) 2002 Elsevier Science Ltd. All rights reserved.

Relevance:

20.00%

Abstract:

This study integrated the research streams of computer-mediated communication (CMC) and group conflict by comparing the expression of different types of conflict in CMC groups and face-to-face (FTF) groups over time. The main aim of the study was to compare the cues-filtered-out approach against the social information processing theory. A laboratory study was conducted with 39 groups (19 CMC and 20 FTF) in which members were required to work together over three sessions. The frequencies of task, process, and relationship conflict were analyzed. Findings supported the social information processing theory. There was more process and relationship conflict in CMC groups compared to FTF groups on Day 1. However, this difference disappeared on Days 2 and 3. There was no difference between CMC and FTF groups in the amount of task conflict expressed on any day.

Relevance:

20.00%

Abstract:

Libraries of cyclic peptides are being synthesized using combinatorial chemistry for high throughput screening in the drug discovery process. This paper describes the min_syn_steps.cpp program (available at http://www.imb.uq.edu.au/groups/smythe/tran), which takes as input a list of cyclic peptides to be synthesized, removes cyclic redundant sequences, and calculates synthetic strategies that minimize the synthetic steps as well as the reagent requirements. The synthetic steps and reagent requirements can be minimized by finding common subsets within the sequences for block synthesis. Since the search space of a brute-force search for optimum synthetic strategies is impractically large, a subset-orientated approach is utilized here to limit the size of the search. (C) 2002 Elsevier Science Ltd. All rights reserved.
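The removal of cyclic redundant sequences can be illustrated with a short sketch. It assumes that two sequences are redundant when one is a rotation of the other and that each residue is written as a single character. Both are assumptions for illustration, the function names are hypothetical, and this is not the min_syn_steps.cpp code.

```python
from typing import Iterable, List


def canonical_rotation(seq: str) -> str:
    """Return the lexicographically smallest rotation of a cyclic sequence."""
    # max(..., 1) lets the empty string map to itself instead of raising.
    return min(seq[i:] + seq[:i] for i in range(max(len(seq), 1)))


def remove_cyclic_redundancy(sequences: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each sequence up to rotation."""
    seen = set()
    unique = []
    for seq in sequences:
        key = canonical_rotation(seq)
        if key not in seen:
            seen.add(key)
            unique.append(seq)
    return unique


# "BCA" and "CAB" are rotations of "ABC"; "ACB" is a different cycle.
peptides = ["ABC", "BCA", "CAB", "ACB"]
unique_peptides = remove_cyclic_redundancy(peptides)
```

Mapping every sequence to one canonical rotation means each new sequence is compared against a set of keys rather than against every earlier sequence.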

Relevance:

20.00%

Abstract:

We focus on mixtures of factor analyzers from the perspective of a method for model-based density estimation from high-dimensional data, and hence for the clustering of such data. This approach enables a normal mixture model to be fitted to a sample of n data points of dimension p, where p is large relative to n. The number of free parameters is controlled through the dimension of the latent factor space. Working in this reduced space allows a model for each component-covariance matrix with complexity lying between that of the isotropic and full covariance structure models. We shall illustrate the use of mixtures of factor analyzers in a practical example that considers the clustering of cell lines on the basis of gene expressions from microarray experiments. (C) 2002 Elsevier Science B.V. All rights reserved.
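The way the latent dimension controls model complexity can be shown by counting free parameters in a single component-covariance matrix. The sketch assumes the standard factor-analytic form, a p x q loading matrix plus a diagonal matrix of p uniquenesses, with q(q-1)/2 parameters removed for the rotational indeterminacy of the loadings. The abstract does not state this form, and the function name is hypothetical.

```python
from typing import Dict


def covariance_parameter_counts(p: int, q: int) -> Dict[str, int]:
    """Free parameters in one p x p covariance matrix under three structures."""
    if not 0 <= q <= p:
        raise ValueError("latent dimension q must satisfy 0 <= q <= p")
    return {
        "isotropic": 1,                                    # sigma^2 * I
        "factor_analytic": p * q + p - q * (q - 1) // 2,   # B B^T + D
        "full": p * (p + 1) // 2,                          # unrestricted symmetric
    }


small = covariance_parameter_counts(p=10, q=2)
large = covariance_parameter_counts(p=1000, q=5)
```

With p = 1000 and q = 5, the factor-analytic structure needs 5,990 parameters per component against 500,500 for a full covariance matrix, which is why the approach stays feasible when p is large relative to n.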

Relevance:

20.00%

Abstract:

Computer Science is a subject which has difficulty in marketing itself. Further, pinning down a standard curriculum is difficult: there are many preferences which are hard to accommodate. This paper argues the case that part of the problem is the fact that, unlike more established disciplines, the subject does not clearly distinguish the study of principles from the study of artifacts. This point was raised in Curriculum 2001 discussions, and debate needs to start in good time for the next curriculum standard. This paper provides a starting point for debate, by outlining a process by which principles and artifacts may be separated, and presents a sample curriculum to illustrate the possibilities. This sample curriculum has some positive points, though these positive points are incidental to the need to start debating the issue. Other models, with a less rigorous ordering of principles before artifacts, would still gain from making it clearer whether a specific concept was fundamental, or a property of a specific technology. (C) 2003 Elsevier Ltd. All rights reserved.

Relevance:

20.00%

Abstract:

In microarray studies, clustering techniques are often used to derive meaningful insights into the data. In the past, hierarchical methods have been the primary clustering tool employed to perform this task. The hierarchical algorithms have been mainly applied heuristically to these cluster analysis problems. Further, a major limitation of these methods is their inability to determine the number of clusters. Thus there is a need for a model-based approach to these clustering problems. To this end, McLachlan et al. [7] developed a mixture model-based algorithm (EMMIX-GENE) for the clustering of tissue samples. To further investigate the EMMIX-GENE procedure as a model-based approach, we present a case study involving the application of EMMIX-GENE to the breast cancer data as studied recently in van 't Veer et al. [10]. Our analysis considers the problem of clustering the tissue samples on the basis of the genes, which is a non-standard problem because the number of genes greatly exceeds the number of tissue samples. We demonstrate how EMMIX-GENE can be useful in reducing the initial set of genes down to a more computationally manageable size. The results from this analysis also emphasise the difficulty associated with the task of separating two tissue groups on the basis of a particular subset of genes. These results also shed light on why supervised methods have such a high misallocation error rate for the breast cancer data.