19 results for Subset

in Cambridge University Engineering Department Publications Database


Abstract:

We present a stochastic simulation technique for subset selection in time series models, based on the use of indicator variables with the Gibbs sampler within a hierarchical Bayesian framework. As an example, the method is applied to the selection of subset linear AR models, in which only significant lags are included. Joint sampling of the indicators and parameters is found to speed convergence. We discuss the possibility of model mixing where the model is not well determined by the data, and the extension of the approach to include non-linear model terms.

Abstract:

MOTIVATION: We present a method for directly inferring transcriptional modules (TMs) by integrating gene expression and transcription factor binding (ChIP-chip) data. Our model extends a hierarchical Dirichlet process mixture model to allow data fusion on a gene-by-gene basis. This encodes the intuition that co-expression and co-regulation are not necessarily equivalent and hence we do not expect all genes to group similarly in both datasets. In particular, it allows us to identify the subset of genes that share the same structure of transcriptional modules in both datasets. RESULTS: We find that by working on a gene-by-gene basis, our model is able to extract clusters with greater functional coherence than existing methods. By combining gene expression and transcription factor binding (ChIP-chip) data in this way, we are better able to determine the groups of genes that are most likely to represent underlying TMs. AVAILABILITY: If interested in the code for the work presented in this article, please contact the authors. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Abstract:

Four types of neural networks which have previously been established for speech recognition and tested on a small, seven-speaker, 100-sentence database are applied to the TIMIT database. The networks are a recurrent network phoneme recognizer, a modified Kanerva model morph recognizer, a compositional representation phoneme-to-word recognizer, and a modified Kanerva model morph-to-word recognizer. The major result is for the recurrent net, giving a phoneme recognition accuracy of 57% from the si and sx sentences. The Kanerva morph recognizer achieves 66.2% accuracy for a small subset of the sa and sx sentences. The results for the word recognizers are incomplete.

Abstract:

The Arabidopsis genome contains a highly complex and abundant population of small RNAs, and many of the endogenous siRNAs are dependent on RNA-Dependent RNA Polymerase 2 (RDR2) for their biogenesis. By analyzing an rdr2 loss-of-function mutant using two different parallel sequencing technologies, MPSS and 454, we characterized the complement of miRNAs expressed in Arabidopsis inflorescence to considerable depth. Nearly all known miRNAs were enriched in this mutant and we identified 13 new miRNAs, all of which were relatively low abundance and constitute new families. Trans-acting siRNAs (ta-siRNAs) were even more highly enriched. Computational and gel blot analyses suggested that the minimal number of miRNAs in Arabidopsis is approximately 155. The size profile of small RNAs in rdr2 reflected enrichment of 21-nt miRNAs and other classes of siRNAs like ta-siRNAs, and a significant reduction in 24-nt heterochromatic siRNAs. Other classes of small RNAs were found to be RDR2-independent, particularly those derived from long inverted repeats and a subset of tandem repeats. The small RNA populations in other Arabidopsis small RNA biogenesis mutants were also examined; a dcl2/3/4 triple mutant showed a similar pattern to rdr2, whereas dcl1-7 and rdr6 showed reductions in miRNAs and ta-siRNAs consistent with their activities in the biogenesis of these types of small RNAs. Deep sequencing of mutants provides a genetic approach for the dissection and characterization of diverse small RNA populations and the identification of low abundance miRNAs.

Abstract:

Life is full of difficult choices. Everyone has their own way of dealing with these, some effective, some not. The problem is particularly acute in engineering design because of the vast amount of information designers have to process. This paper deals with a subset of this set of problems: the subset of selecting materials and processes, and their links to the design of products. Even these, though, present many of the generic problems of choice, and the challenges in creating tools to assist the designer in making them. The key elements are those of classification, of indexing, of reaching decisions using incomplete data in many different formats, and of devising effective strategies for selection. This final element - that of selection strategies - poses particular challenges. Product design, as an example, is an intricate blend of the technical and (for want of a better word) the aesthetic. To meet these needs, a tool that allows selection by analysis, by analogy, by association and simply by 'browsing' is necessary. An example of such a tool, its successes and remaining challenges, will be described.

Abstract:

The assembly of any manufactured product involves joining. This paper describes ways of selecting processes for joining. The method allows discrimination of the joint geometry, joint loading, material, and other attributes of the joint itself, identifying the subset of available processes capable of meeting a given set of design constraints. A relational database containing data-tables for joining processes, materials to be joined, and joint geometry and mode of loading, allows the attributes of each of these to be stored in an appropriate format, and permits links to be created between those that are related. A search engine isolates the processes that meet design requirements on material, joint geometry and loading. The method is illustrated in Part 2 by case studies, utilising software that embodies the method.
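The screening step described in this abstract can be illustrated with a minimal sketch. The process records, attribute names and values below are hypothetical placeholders, not data from the paper's database; the sketch only shows how constraints on material, joint geometry and loading isolate a subset of candidate processes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoiningProcess:
    name: str
    materials: frozenset   # material classes the process can join
    geometries: frozenset  # joint geometries the process supports
    loadings: frozenset    # modes of loading the joint can carry


# Hypothetical records, for illustration only.
PROCESSES = [
    JoiningProcess("adhesive bonding",
                   frozenset({"metal", "polymer", "ceramic"}),
                   frozenset({"lap"}),
                   frozenset({"shear"})),
    JoiningProcess("arc welding",
                   frozenset({"metal"}),
                   frozenset({"butt", "lap", "tee"}),
                   frozenset({"tension", "shear", "bending"})),
    JoiningProcess("riveting",
                   frozenset({"metal", "polymer"}),
                   frozenset({"lap"}),
                   frozenset({"shear", "tension"})),
]


def select_processes(processes, material, geometry, loading):
    """Return the names of processes that satisfy all three constraints."""
    return [p.name for p in processes
            if material in p.materials
            and geometry in p.geometries
            and loading in p.loadings]


candidates = select_processes(PROCESSES, "polymer", "lap", "shear")
print(candidates)
```

In a relational database the same screening would typically be a join across the process, material and geometry tables; the in-memory filter above is only a stand-in for that query.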

Abstract:

There are occasions when ultrasound beamforming is performed with only a subset of the total data that will eventually be available. The most obvious example is a mechanically-swept (wobbler) probe in which the three-dimensional data block is formed from a set of individual B-scans. In these circumstances, non-blind deconvolution can be used to improve the resolution of the data. Unfortunately, most of these situations involve large blocks of three-dimensional data. Furthermore, the ultrasound blur function varies spatially with distance from the transducer. These two facts make the deconvolution process time-consuming to implement. This paper is about ways to address this problem and produce spatially-varying deconvolution of large blocks of three-dimensional data in a matter of seconds. We present two approaches, one based on hardware and the other based on software. We compare the time they each take to achieve similar results and discuss the computational resources and form of blur model that each requires.

Abstract:

We introduce a Gaussian process model of functions which are additive. An additive function is one which decomposes into a sum of low-dimensional functions, each depending on only a subset of the input variables. Additive GPs generalize both Generalized Additive Models, and the standard GP models which use squared-exponential kernels. Hyperparameter learning in this model can be seen as Bayesian Hierarchical Kernel Learning (HKL). We introduce an expressive but tractable parameterization of the kernel function, which allows efficient evaluation of all input interaction terms, whose number is exponential in the input dimension. The additional structure discoverable by this model results in increased interpretability, as well as state-of-the-art predictive power in regression tasks.
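As an illustration of how a sum over all interaction terms of a given order can be evaluated without enumerating them, the sketch below assumes that the order-d term is the d-th elementary symmetric polynomial of one-dimensional base kernel values. That is an assumption made here for illustration, since the abstract does not spell out the parameterization. The function name and the example values are hypothetical.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(values, max_order):
    """Return [e_0, ..., e_max_order] of `values` in O(D * max_order) time.

    e_d is the sum, over all size-d subsets of the inputs, of the product
    of the chosen values. The recurrence adds one value at a time and
    updates the orders from high to low so each value is used at most once.
    """
    e = [1.0] + [0.0] * max_order
    for z in values:
        for d in range(max_order, 0, -1):
            e[d] += z * e[d - 1]
    return e


# Hypothetical one-dimensional base kernel values k_i(x_i, x_i') for D = 4.
z = [0.9, 0.5, 0.2, 0.7]
fast = elementary_symmetric(z, 3)

# Brute-force check that enumerates every subset explicitly.
brute = [sum(prod(c) for c in combinations(z, d)) for d in range(4)]
print(fast, brute)
```

The brute-force line touches every subset of the inputs, which is what becomes infeasible as the input dimension grows; the recurrence gives the same sums with a number of operations linear in the dimension for a fixed maximum order.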

Abstract:

We derive a random-coding upper bound on the average probability of error of joint source-channel coding that recovers Csiszár's error exponent when used with product distributions over the channel inputs. Our proof technique for the error probability analysis employs a code construction for which source messages are assigned to subsets and codewords are generated with a distribution that depends on the subset. © 2012 IEEE.

Abstract:

When searching for characteristic subpatterns in potentially noisy graph data, it appears self-evident that having multiple observations would be better than having just one. However, it turns out that the inconsistencies introduced when different graph instances have different edge sets pose a serious challenge. In this work we address this challenge for the problem of finding maximum weighted cliques. We introduce the concept of the most persistent soft-clique. This is a subset of vertices that 1) is almost fully or at least densely connected, 2) occurs in all or almost all graph instances, and 3) has the maximum weight. We present a measure of clique-ness that essentially counts the number of edges missing to make a subset of vertices into a clique. With this measure, we show that the problem of finding the most persistent soft-clique can be cast either as a) a max-min two-person game optimization problem, or b) a min-min soft margin optimization problem. Both formulations lead to the same solution when using a partial Lagrangian method to solve the optimization problems. By experiments on synthetic data and on real social network data we show that the proposed method is able to reliably find soft cliques in graph data, even if it is distorted by random noise or unreliable observations. Copyright 2012 by the author(s)/owner(s).
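A measure of this kind can be sketched as follows, assuming an undirected graph stored as a list of vertex pairs. The function name and the toy graph instances are hypothetical, and the sketch covers only the edge-counting step, not the optimization formulations.

```python
from itertools import combinations


def missing_edges(vertices, edges):
    """Count the vertex pairs within `vertices` that have no edge in `edges`.

    A result of 0 means the subset is a clique in this graph instance.
    """
    edge_set = {frozenset(e) for e in edges}
    return sum(1 for u, v in combinations(sorted(vertices), 2)
               if frozenset((u, v)) not in edge_set)


# Two hypothetical observations of the same graph; the second lost edge (1, 3).
g1 = [(1, 2), (1, 3), (2, 3), (3, 4)]
g2 = [(1, 2), (2, 3), (3, 4)]

subset = {1, 2, 3}
counts = [missing_edges(subset, g) for g in (g1, g2)]
print(counts)
```

Here the subset is a full clique in the first instance and one edge short in the second, which is the kind of inconsistency between observations that the abstract describes.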

Abstract:

Model predictive control allows systematic handling of physical and operational constraints through the use of constrained optimisation. It has also been shown to successfully exploit plant redundancy to maintain a level of control in scenarios when faults are present. Unfortunately, the computational complexity of each individual iteration of the algorithm to solve the optimisation problem scales cubically with the number of plant inputs, so the computational demands are high for large MIMO plants. Multiplexed MPC only calculates changes in a subset of the plant inputs at each sampling instant, thus reducing the complexity of the optimisation. This paper demonstrates the application of multiplexed model predictive control to a large transport airliner in a nominal and a contingency scenario. The performance is compared to that obtained with a conventional synchronous model predictive controller, designed using an equivalent cost function. © 2012 AACC (American Automatic Control Council).
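The scheduling idea can be sketched as below, assuming a simple round-robin choice of which inputs are updated at each sampling instant and taking the cubic scaling quoted in the abstract at face value. The function names and the input counts are hypothetical; the update pattern actually used in the paper is not given in the abstract.

```python
def round_robin_schedule(num_inputs, subset_size, num_steps):
    """Indices of the inputs whose moves are recomputed at each sampling instant."""
    schedule = []
    start = 0
    for _ in range(num_steps):
        schedule.append([(start + i) % num_inputs for i in range(subset_size)])
        start = (start + subset_size) % num_inputs
    return schedule


def relative_iteration_cost(num_inputs, subset_size):
    """Per-iteration cost relative to updating all inputs, under cubic scaling."""
    return (subset_size / num_inputs) ** 3


# Hypothetical plant with 6 inputs, updating 2 of them per sampling instant.
schedule = round_robin_schedule(num_inputs=6, subset_size=2, num_steps=3)
ratio = relative_iteration_cost(num_inputs=6, subset_size=2)
print(schedule, ratio)
```

Under these assumptions every input is revisited once every three sampling instants, and each optimisation iteration costs about 1/27 of the synchronous case. The ratio ignores everything except the cubic term.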

Abstract:

In this study a 5-step reduced chemical kinetic mechanism involving nine species is developed for combustion of Blast Furnace Gas (BFG), a multi-component fuel containing CO/H2/CH4/CO2, typically with low hydrogen, methane and high water fractions, for conditions relevant for stationary gas-turbine combustion. This reduced mechanism is obtained from a 49-reaction skeletal mechanism which is a modified subset of GRI Mech 3.0. The skeletal and reduced mechanisms are validated for laminar flame speeds, ignition delay times and flame structure with available experimental data, and using computational results with a comprehensive set of elementary reactions. Overall, both the skeletal and reduced mechanisms show very good agreement over a wide range of pressure, reactant temperature and fuel mixture composition. © 2012 The Combustion Institute.

Abstract:

Identifying strategies for reducing greenhouse gas emissions from steel production requires a comprehensive model of the sector but previous work has either failed to consider the whole supply chain or considered only a subset of possible abatement options. In this work, a global mass flow analysis is combined with process emissions intensities to allow forecasts of future steel sector emissions under all abatement options. Scenario analysis shows that global capacity for primary steel production is already near to a peak and that if sectoral emissions are to be reduced by 50% by 2050, the last required blast furnace will be built by 2020. Emissions reduction targets cannot be met by energy and emissions efficiency alone, but deploying material efficiency provides sufficient extra abatement potential.

Abstract:

We propose an algorithm for solving optimization problems defined on a subset of the cone of symmetric positive semidefinite matrices. This algorithm relies on the factorization $X = YY^T$, where the number of columns of $Y$ fixes an upper bound on the rank of the positive semidefinite matrix $X$. It is thus very effective for solving problems that have a low-rank solution. The factorization $X = YY^T$ leads to a reformulation of the original problem as an optimization on a particular quotient manifold. The present paper discusses the geometry of that manifold and derives a second-order optimization method with guaranteed quadratic convergence. It furthermore provides some conditions on the rank of the factorization to ensure equivalence with the original problem. In contrast to existing methods, the proposed algorithm converges monotonically to the sought solution. Its numerical efficiency is evaluated on two applications: the maximal cut of a graph and the problem of sparse principal component analysis. © 2010 Society for Industrial and Applied Mathematics.
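A short check of why the factorization bounds the rank, using only standard linear algebra. The dimensions n and p are notation introduced here, with Y a real n-by-p matrix; they do not appear in the abstract.

```latex
\begin{align*}
v^T X v &= v^T Y Y^T v = \lVert Y^T v \rVert_2^2 \ge 0
  \quad \text{for all } v \in \mathbb{R}^n, \\
\operatorname{rank}(X) &= \operatorname{rank}(Y Y^T) = \operatorname{rank}(Y)
  \le \min(n, p) \le p .
\end{align*}
```

Every matrix of this form is therefore symmetric positive semidefinite with rank at most p, so optimizing over Y involves np unknowns rather than the n(n+1)/2 free entries of a general symmetric matrix.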

Abstract:

We introduce a conceptually novel structured prediction model, GPstruct, which is kernelized, non-parametric and Bayesian, by design. We motivate the model with respect to existing approaches, among others, conditional random fields (CRFs), maximum margin Markov networks (M3N), and structured support vector machines (SVMstruct), which embody only a subset of its properties. We present an inference procedure based on Markov Chain Monte Carlo. The framework can be instantiated for a wide range of structured objects such as linear chains, trees, grids, and other general graphs. As a proof of concept, the model is benchmarked on several natural language processing tasks and a video gesture segmentation task involving a linear chain structure. We show prediction accuracies for GPstruct which are comparable to or exceeding those of CRFs and SVMstruct.