955 results for Random Subspace Method


Relevance: 30.00%

Abstract:

A strategy of "sequence scanning" is proposed for rapid acquisition of sequence from clones such as bacteriophage P1 clones, cosmids, or yeast artificial chromosomes. The approach makes use of a special vector, called LambdaScan, that reliably yields subclones with inserts in the size range 8-12 kb. A number of subclones, typically 96 or 192, are chosen at random, and the ends of the inserts are sequenced using vector-specific primers. Then long-range PCR is used to order and orient the clones. This combination of shotgun and directed sequencing results in a high-resolution physical map suitable for the identification of coding regions or for comparison of sequence organization among genomes. Computer simulations indicate that, for a target clone of 100 kb, the scanning of 192 subclones with sequencing reads as short as 350 bp results in an approximate ratio of 1:2:1 of regions of double-stranded sequence, single-stranded sequence, and gaps. Longer sequencing reads tip the ratio strongly toward increased double-stranded sequence.
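The coverage simulation described above can be sketched as a short Monte Carlo run. This is a toy sketch, not the authors' program: the parameters (100 kb target, 192 subclones, 8-12 kb inserts, 350 bp reads) come from the abstract, but classifying bases purely by read depth (0 = gap, 1 = single-stranded, ≥2 = double-stranded) is a simplifying assumption in place of strand-aware bookkeeping.

```python
import numpy as np

rng = np.random.default_rng(0)

TARGET = 100_000   # target clone of 100 kb (from the abstract)
N_SUBCLONES = 192  # subclones scanned
READ = 350         # sequencing read length in bp

depth = np.zeros(TARGET, dtype=int)
for _ in range(N_SUBCLONES):
    insert = rng.integers(8_000, 12_001)           # insert size 8-12 kb
    start = rng.integers(0, TARGET - insert + 1)   # random placement
    # one read inward from each end of the insert
    depth[start:start + READ] += 1
    depth[start + insert - READ:start + insert] += 1

gap = np.mean(depth == 0)
single = np.mean(depth == 1)
double = np.mean(depth >= 2)
print(f"double:single:gap = {double:.2f}:{single:.2f}:{gap:.2f}")
```

Ratios from a single run fluctuate, but the qualitative split between double-stranded regions, single-stranded regions, and gaps emerges directly from this bookkeeping.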

Relevance: 30.00%

Abstract:

The helix-coil transition equilibrium of polypeptides in aqueous solution was studied by molecular dynamics simulation. The peptide growth simulation method was introduced to generate dynamic models of polypeptide chains in a statistical (random) coil or an alpha-helical conformation. The key element of this method is to build up a polypeptide chain during the course of a molecular dynamics simulation, successively adding whole amino acid residues to the chain in a predefined conformational state (e.g., alpha-helical or statistical coil). Thus, oligopeptides of the same length and composition, but having different conformations, can be incrementally grown from a common precursor, and their relative conformational free energies can be calculated as the difference between the free energies for growing the individual peptides. This affords a straightforward calculation of the Zimm-Bragg sigma and s parameters for helix initiation and helix growth. The calculated sigma and s parameters for the polyalanine alpha-helix are in reasonable agreement with the experimental measurements. The peptide growth simulation method is an effective way to study quantitatively the thermodynamics of local protein folding.
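The Zimm-Bragg sigma and s parameters mentioned above feed a standard transfer-matrix partition function, from which the mean helix content follows by differentiation. A minimal numerical sketch (the specific convention below — weight 1 for coil, s for helix after helix, sigma·s for a nucleated helix residue — is one common formulation, not taken from this paper):

```python
import numpy as np

def zimm_bragg_Z(N, s, sigma):
    """Partition function of an N-residue Zimm-Bragg chain."""
    # M[prev, next] over (helix, coil) states: weight 1 for coil,
    # s for helix following helix, sigma*s for helix nucleated after coil.
    M = np.array([[s, 1.0],
                  [sigma * s, 1.0]])
    start = np.array([0.0, 1.0])          # chain entered from a coil state
    return start @ np.linalg.matrix_power(M, N) @ np.ones(2)

def helix_fraction(N, s, sigma, ds=1e-6):
    # theta = <n_helix>/N = (s/N) d ln Z / d s, by central differences
    dlnZ = (np.log(zimm_bragg_Z(N, s + ds, sigma))
            - np.log(zimm_bragg_Z(N, s - ds, sigma))) / (2 * ds)
    return s * dlnZ / N

for s in (0.8, 1.0, 1.3):
    print(s, round(helix_fraction(50, s, 1e-3), 3))
```

With a small nucleation parameter sigma, the helix fraction switches sharply from near 0 to near 1 as s crosses 1, which is the cooperativity the simulations aim to reproduce.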

Relevance: 30.00%

Abstract:

In this short review, we provide some new insights into the material synthesis and characterization of modern multi-component superconducting oxides. Two different approaches, the high-pressure, high-temperature method and ceramic combinatorial chemistry, are reported with application to several typical examples. First, we highlight the key role of the extreme conditions in the growth of Fe-based superconductors, where a careful control of the composition-structure relation is vital for understanding the microscopic physics. The availability of high-quality LnFeAsO (Ln = lanthanide) single crystals with substitution of O by F, Sm by Th, Fe by Co, and As by P allowed us to measure intrinsic and anisotropic superconducting properties such as Hc2 and Jc. Furthermore, we demonstrate that combinatorial ceramic chemistry is an efficient way to search for new superconducting compounds. A single-sample synthesis concept based on multi-element ceramic mixtures can produce a variety of local products. Such a system needs local probe analyses and separation techniques to identify compounds of interest. We present the results obtained from random mixtures of Ca, Sr, Ba, La, Zr, Pb, Tl, Y, Bi, and Cu oxides reacted at different conditions. By adding Zr while removing Tl, Y, and Bi, the bulk superconductivity was enhanced up to about 122 K.

Relevance: 30.00%

Abstract:

A two-component mixture regression model that allows simultaneously for heterogeneity and dependency among observations is proposed. By specifying random effects explicitly in the linear predictor of the mixture probability and the mixture components, parameter estimation is achieved by maximising the corresponding best linear unbiased prediction type log-likelihood. Approximate residual maximum likelihood estimates are obtained via an EM algorithm in the manner of generalised linear mixed models (GLMMs). The method can be extended to a g-component mixture regression model with the component density from the exponential family, leading to the development of the class of finite mixture GLMMs. For illustration, the method is applied to analyse neonatal length of stay (LOS). It is shown that identification of pertinent factors that influence hospital LOS can provide important information for health care planning and resource allocation. (C) 2002 Elsevier Science B.V. All rights reserved.
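The EM idea behind the estimation can be illustrated on a deliberately simplified version of the model: a two-component mixture of linear regressions without the random effects (adding them is what the paper's BLUP-type likelihood handles). All data and starting values below are synthetic assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data from two regression lines (the paper's model additionally puts
# random effects in the linear predictors of probability and components).
n = 400
comp = rng.random(n) < 0.4
x = rng.uniform(0, 1, n)
y = np.where(comp, 1 + 2 * x, 5 - x) + rng.normal(0, 0.25, n)

X = np.column_stack([np.ones(n), x])
beta = np.array([[0.0, 1.0], [4.0, 0.0]])   # rough starting lines
pi = np.array([0.5, 0.5])
sig2 = np.array([1.0, 1.0])

for _ in range(100):
    # E-step: posterior probability that each point belongs to component k
    mu = X @ beta.T                                       # n x 2 means
    dens = pi * np.exp(-(y[:, None] - mu) ** 2 / (2 * sig2)) \
           / np.sqrt(2 * np.pi * sig2)
    r = dens / dens.sum(axis=1, keepdims=True)
    # M-step: weighted least squares per component, then pi and sig2
    for k in range(2):
        w = r[:, k]
        beta[k] = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
        sig2[k] = np.sum(w * (y - X @ beta[k]) ** 2) / w.sum()
    pi = r.mean(axis=0)

print(pi.round(2), beta.round(2), sig2.round(3))
```

The recovered slopes approach the generating values (2 and -1), showing the alternation between responsibilities and weighted fits that the GLMM-style EM generalises.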

Relevance: 30.00%

Abstract:

We investigate whether relative contributions of genetic and shared environmental factors are associated with an increased risk of melanoma. Data from the Queensland Familial Melanoma Project comprising 15,907 subjects arising from 1912 families were analyzed to estimate the additive genetic, common and unique environmental contributions to variation in the age at onset of melanoma. Two complementary approaches for analyzing correlated time-to-onset family data were considered: the generalized estimating equations (GEE) method, in which one can estimate relationship-specific dependence simultaneously with regression coefficients that describe the average population response to changing covariates; and a subject-specific Bayesian mixed model, in which heterogeneity in regression parameters is explicitly modeled and the different components of variation may be estimated directly. The proportional hazards and Weibull models were utilized, as both produce natural frameworks for estimating relative risks while adjusting for simultaneous effects of other covariates. A simple Markov chain Monte Carlo method for covariate imputation of missing data was used, and the actual implementation of the Bayesian model was based on Gibbs sampling using the freeware package BUGS. In addition, we also used a Bayesian model to investigate the relative contribution of genetic and environmental effects on the expression of naevi and freckles, which are known risk factors for melanoma.

Relevance: 30.00%

Abstract:

We present an efficient and robust method for the calculation of all S matrix elements (elastic, inelastic, and reactive) over an arbitrary energy range from a single real-symmetric Lanczos recursion. Our new method transforms the fundamental equations associated with Light's artificial boundary inhomogeneity approach [J. Chem. Phys. 102, 3262 (1995)] from the primary representation (original grid or basis representation of the Hamiltonian or its function) into a single tridiagonal Lanczos representation, thereby affording an iterative version of the original algorithm with greatly superior scaling properties. The method has important advantages over existing iterative quantum dynamical scattering methods: (a) the numerically intensive matrix propagation proceeds with real symmetric algebra, which is inherently more stable than its complex symmetric counterpart; (b) no complex absorbing potential or real damping operator is required, saving much of the exterior grid space which is commonly needed to support these operators and also removing the associated parameter dependence. Test calculations are presented for the collinear H + H2 reaction, revealing excellent performance characteristics. (C) 2004 American Institute of Physics.
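The real-symmetric Lanczos recursion at the heart of the method reduces a symmetric matrix to tridiagonal form while preserving its spectrum. A self-contained sketch on a random symmetric "Hamiltonian" (with full reorthogonalisation, a standard numerical safeguard not specific to this paper):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy stand-in for a Hamiltonian in its primary (grid/basis) representation.
n, m = 60, 60                       # run the full recursion for an exact check
A = rng.normal(size=(n, n))
H = (A + A.T) / 2                   # real symmetric

alpha, beta = np.zeros(m), np.zeros(m - 1)
q = rng.normal(size=n); q /= np.linalg.norm(q)
Q = np.zeros((n, m)); Q[:, 0] = q
for j in range(m):
    w = H @ Q[:, j]
    alpha[j] = Q[:, j] @ w
    w -= alpha[j] * Q[:, j]
    if j > 0:
        w -= beta[j - 1] * Q[:, j - 1]
    # full reorthogonalisation against all previous Lanczos vectors
    w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
    if j < m - 1:
        beta[j] = np.linalg.norm(w)
        Q[:, j + 1] = w / beta[j]

T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
ev_H = np.linalg.eigvalsh(H)
ev_T = np.linalg.eigvalsh(T)
err = np.max(np.abs(ev_H - ev_T))
print(err)
```

In practice only m << n steps are taken and H is never formed explicitly — only matrix-vector products are needed — which is the source of the superior scaling claimed above.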

Relevance: 30.00%

Abstract:

The buffer allocation problem (BAP) is a well-known difficult problem in the design of production lines. We present a stochastic algorithm for solving the BAP, based on the cross-entropy method, a new paradigm for stochastic optimization. The algorithm involves the following iterative steps: (a) the generation of buffer allocations according to a certain random mechanism, followed by (b) the modification of this mechanism on the basis of cross-entropy minimization. Through various numerical experiments we demonstrate the efficiency of the proposed algorithm and show that the method can quickly generate (near-)optimal buffer allocations for fairly large production lines.
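Steps (a) and (b) of the algorithm can be sketched directly. This toy keeps the CE mechanics — sample allocations, keep an elite fraction, shift the sampling distribution toward the elites — but replaces the production-line throughput with an assumed weighted-log surrogate objective, since the paper evaluates real line throughput:

```python
import numpy as np

rng = np.random.default_rng(4)

K, B = 5, 20                  # stations and total buffer spaces
N, RHO = 200, 0.1             # samples per iteration, elite fraction
w = np.array([3.0, 1.0, 1.0, 1.0, 1.0])   # assumed station weights

def throughput(allocs):
    # Toy surrogate objective (an assumption, not the BAP throughput).
    return (w * np.log1p(allocs)).sum(axis=1)

p = np.full(K, 1.0 / K)       # buffer-placement probabilities: the
for _ in range(50):           # "random mechanism" of step (a)
    samples = rng.multinomial(B, p, size=N)          # (a) generate
    scores = throughput(samples)
    elite = samples[scores >= np.quantile(scores, 1 - RHO)]
    p = 0.7 * p + 0.3 * elite.mean(axis=0) / B       # (b) CE update

print(p.round(3), rng.multinomial(B, p))
```

The smoothing factor 0.7/0.3 is a common CE stabilisation choice; the placement probabilities concentrate on the heavily weighted station, mirroring how the method steers sampling toward (near-)optimal allocations.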

Relevance: 30.00%

Abstract:

The notorious "dimensionality curse" is a well-known phenomenon for any multi-dimensional index attempting to scale up to high dimensions. One well-known approach to overcoming degradation in performance with respect to increasing dimensions is to reduce the dimensionality of the original dataset before constructing the index. However, identifying the correlation among the dimensions and effectively reducing them are challenging tasks. In this paper, we present an adaptive Multi-level Mahalanobis-based Dimensionality Reduction (MMDR) technique for high-dimensional indexing. Our MMDR technique has four notable features compared to existing methods. First, it discovers elliptical clusters for more effective dimensionality reduction by using only the low-dimensional subspaces. Second, data points in the different axis systems are indexed using a single B+-tree. Third, our technique is highly scalable in terms of data size and dimension. Finally, it is also dynamic and adaptive to insertions. An extensive performance study was conducted using both real and synthetic datasets, and the results show that our technique not only achieves higher precision, but also enables queries to be processed efficiently. Copyright Springer-Verlag 2005
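The core reduction step — finding a cluster's elliptical shape from its covariance and keeping only the dominant axes — can be sketched for a single cluster (a simplification: MMDR handles multiple clusters and indexes their axis systems in one B+-tree; the 95% variance threshold below is an assumed parameter):

```python
import numpy as np

rng = np.random.default_rng(5)

# One elongated (elliptical) cluster in 10-D: large spread on 2 hidden
# axes, tiny spread on the rest, under a random rotation.
d, n = 10, 2000
scales = np.array([5.0, 3.0] + [0.1] * (d - 2))
A = rng.normal(size=(d, d))
Q_rot, _ = np.linalg.qr(A)                  # random orthogonal rotation
X = rng.normal(size=(n, d)) * scales @ Q_rot.T

mu = X.mean(axis=0)
cov = np.cov(X, rowvar=False)
evals, evecs = np.linalg.eigh(cov)          # eigenvalues ascending
order = np.argsort(evals)[::-1]
cum = np.cumsum(evals[order]) / evals.sum()
k = int(np.searchsorted(cum, 0.95)) + 1     # smallest k explaining 95%
U = evecs[:, order[:k]]                     # reduced axis system
Xr = (X - mu) @ U                           # k-dimensional representation

explained = evals[order[:k]].sum() / evals.sum()
print(k, round(explained, 4))
```

The cluster's effective dimensionality (2 here) is recovered regardless of the rotation, which is what makes covariance-driven, per-cluster reduction more effective than picking global axes.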

Relevance: 30.00%

Abstract:

Chromogenic (CISH) and fluorescent (FISH) in situ hybridization have emerged as reliable techniques to identify amplifications and chromosomal translocations. CISH provides a spatial distribution of gene copy number changes in tumour tissue and allows a direct correlation between copy number changes and the morphological features of neoplastic cells. However, the limited number of commercially available gene probes has hindered the use of this technique. We have devised a protocol to generate probes for CISH that can be applied to formalin-fixed, paraffin-embedded tissue sections (FFPETS). Bacterial artificial chromosomes (BACs) containing fragments of human DNA which map to specific genomic regions of interest are amplified with phi29 polymerase and random primer labelled with biotin. The genomic location of these can be readily confirmed by BAC end pair sequencing and FISH mapping on normal lymphocyte metaphase spreads. To demonstrate the reliability of the probes generated with this protocol, four strategies were employed: (i) probes mapping to cyclin D1 (CCND1) were generated and their performance was compared with that of a commercially available probe for the same gene in a series of 10 FFPETS of breast cancer samples of which five harboured CCND1 amplification; (ii) probes targeting cyclin-dependent kinase 4 were used to validate an amplification identified by microarray-based comparative genomic hybridization (aCGH) in a pleomorphic adenoma; (iii) probes targeting fibroblast growth factor receptor 1 and CCND1 were used to validate amplifications mapping to these regions, as defined by aCGH, in an invasive lobular breast carcinoma with FISH and CISH; and (iv) gene-specific probes for ETV6 and NTRK3 were used to demonstrate the presence of t(12;15)(p12;q25) translocation in a case of breast secretory carcinoma with dual colour FISH. In summary, this protocol enables the generation of probes mapping to any gene of interest that can be applied to FFPETS, allowing correlation of morphological features with gene copy number.

Relevance: 30.00%

Abstract:

In this paper, we propose a novel high-dimensional index method, the BM+-tree, to support efficient processing of similarity search queries in high-dimensional spaces. The main idea of the proposed index is to improve data partitioning efficiency in a high-dimensional space by using a rotary binary hyperplane, which further partitions a subspace and can also take advantage of the twin node concept used in the M+-tree. Compared with the key dimension concept in the M+-tree, the binary hyperplane is more effective in data filtering. High space utilization is achieved by dynamically performing data reallocation between twin nodes. In addition, a post-processing step is used after index building to ensure effective filtration. Experimental results using two types of real data sets illustrate a significantly improved filtering efficiency.
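The basic act of splitting a node's entries with a binary hyperplane — rather than along a single key dimension — can be illustrated in a few lines. This is an illustrative construction only (the hyperplane here joins two pivot points; the BM+-tree's actual rotary hyperplane construction is more involved):

```python
import numpy as np

rng = np.random.default_rng(6)

# 500 points in 8-D standing in for one index node's entries.
X = rng.normal(size=(500, 8))
p1, p2 = X[0], X[1]                   # two pivot points (assumed choice)
normal = p2 - p1                      # hyperplane normal joins the pivots
midpoint = (p1 + p2) / 2
side = (X - midpoint) @ normal > 0    # binary test: which side of the plane

left, right = X[~side], X[side]
print(len(left), len(right))
```

Because the split uses a full linear combination of coordinates instead of one key dimension, it adapts to the data's orientation, which is the filtering advantage claimed over the M+-tree's key-dimension splits.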

Relevance: 30.00%

Abstract:

Computer models, or simulators, are widely used in a range of scientific fields to aid understanding of the processes involved and make predictions. Such simulators are often computationally demanding and are thus not amenable to direct statistical analysis. Emulators provide a statistical approximation, or surrogate, for the simulator, accounting for the additional approximation uncertainty. This thesis develops a novel sequential screening method to reduce the set of simulator variables considered during emulation. This screening method is shown to require fewer simulator evaluations than existing approaches. Utilising the lower-dimensional active variable set simplifies subsequent emulation analysis. For random output, or stochastic, simulators the output dispersion, and thus variance, is typically a function of the inputs. This work extends the emulator framework to account for such heteroscedasticity by constructing two new heteroscedastic Gaussian process representations and proposes an experimental design technique to optimally learn the model parameters. The design criterion is an extension of Fisher information to heteroscedastic variance models. Replicated observations are efficiently handled in both the design and model inference stages. Through a series of simulation experiments on both synthetic and real world simulators, the emulators inferred on optimal designs with replicated observations are shown to outperform equivalent models inferred on space-filling replicate-free designs in terms of both model parameter uncertainty and predictive variance.

Relevance: 30.00%

Abstract:

The principled statistical application of Gaussian random field models used in geostatistics has historically been limited to data sets of a small size. This limitation is imposed by the requirement to store and invert the covariance matrix of all the samples to obtain a predictive distribution at unsampled locations, or to use likelihood-based covariance estimation. Various ad hoc approaches to solve this problem have been adopted, such as selecting a neighborhood region and/or a small number of observations to use in the kriging process, but these have no sound theoretical basis and it is unclear what information is being lost. In this article, we present a Bayesian method for estimating the posterior mean and covariance structures of a Gaussian random field using a sequential estimation algorithm. By imposing sparsity in a well-defined framework, the algorithm retains a subset of “basis vectors” that best represent the “true” posterior Gaussian random field model in the relative entropy sense. This allows a principled treatment of Gaussian random field models on very large data sets. The method is particularly appropriate when the Gaussian random field model is regarded as a latent variable model, which may be nonlinearly related to the observations. We show the application of the sequential, sparse Bayesian estimation in Gaussian random field models and discuss its merits and drawbacks.
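The bottleneck and the remedy described above can both be seen in a few lines: the full Gaussian random field prediction requires solving against the n × n covariance of all samples, while a subset-of-basis-vectors approximation solves only an m × m system. The sketch below picks the m basis points at random, a placeholder for the article's relative-entropy-based selection:

```python
import numpy as np

rng = np.random.default_rng(8)

def kern(A, B, ls=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

n, m, s2 = 400, 40, 0.05             # samples, basis subset size, noise var
x = rng.uniform(0, 4, (n, 1))
y = np.sin(x[:, 0]) + rng.normal(0, np.sqrt(s2), n)
xs = np.linspace(0, 4, 9)[:, None]

# full kriging predictor: the O(n^3) inversion discussed above
K = kern(x, x) + s2 * np.eye(n)
full_mean = kern(xs, x) @ np.linalg.solve(K, y)

# sparse predictor from m "basis vectors" (random subset here; the
# article retains the subset that is best in the relative entropy sense)
xm = x[rng.choice(n, m, replace=False)]
Kmn = kern(xm, x)
Kmm = kern(xm, xm)
A = Kmn @ Kmn.T + s2 * Kmm + 1e-10 * np.eye(m)
sparse_mean = kern(xs, xm) @ np.linalg.solve(A, Kmn @ y)

print(np.max(np.abs(full_mean - sparse_mean)))
```

With a well-chosen (here merely well-spread) subset, the sparse predictor tracks the full one closely at a fraction of the cost, which is the trade the sequential sparse algorithm makes in a principled way.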

Relevance: 30.00%

Abstract:

We analyze the stochastic creation of a single bound state (BS) in a random potential with a compact support. We study both the Hermitian Schrödinger equation and non-Hermitian Zakharov-Shabat systems. These problems are of special interest in the inverse scattering method for the Korteweg–de Vries and nonlinear Schrödinger equations, since soliton solutions of these two equations correspond to the BSs of the two aforementioned linear eigenvalue problems. Analytical expressions for the average width of the potential required for the creation of the first BS are given in the approximation of a delta-correlated Gaussian potential, and additionally different scenarios of eigenvalue creation are discussed for the non-Hermitian case.
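For the Hermitian case, the phenomenon is easy to probe numerically: discretise H = -d²/dx² + V(x) with a Gaussian random V supported on [0, L] and count negative eigenvalues (bound states) as the support widens. This finite-difference sketch is an illustration only; the paper's delta-correlated results are analytical, and the grid-level noise scaling below is an assumed discretisation of white noise.

```python
import numpy as np

rng = np.random.default_rng(9)

def n_bound(L, sigma=2.0, dx=0.1, pad=15.0):
    """Count E < 0 eigenstates of -d2/dx2 + V with V random on [0, L]."""
    x = np.arange(-pad, L + pad, dx)
    V = np.zeros_like(x)
    inside = (x >= 0) & (x <= L)
    # approximate delta-correlated Gaussian noise on the support
    V[inside] = sigma * rng.normal(size=inside.sum()) / np.sqrt(dx)
    main = 2.0 / dx**2 + V                      # finite-difference Laplacian
    off = -np.ones(len(x) - 1) / dx**2
    H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return int(np.sum(np.linalg.eigvalsh(H) < 0))

widths = np.linspace(0.5, 8.0, 5)
counts = [np.mean([n_bound(L) for _ in range(4)]) for L in widths]
print(list(zip(widths.round(1), counts)))
```

Averaging over realisations, the bound-state count rises with the support width, and the width at which the mean count first reaches one is the quantity the analytical expressions describe.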

Relevance: 30.00%

Abstract:

This thesis includes analysis of disordered spin ensembles corresponding to Exact Cover, a multi-access channel problem, and composite models combining sparse and dense interactions. The satisfiability problem in Exact Cover is addressed using a statistical analysis of a simple branch and bound algorithm. The algorithm can be formulated in the large system limit as a branching process, for which critical properties can be analysed. Far from the critical point a set of differential equations may be used to model the process, and these are solved by numerical integration and exact bounding methods. The multi-access channel problem is formulated as an equilibrium statistical physics problem for the case of bit transmission on a channel with power control and synchronisation. A sparse code division multiple access method is considered, and the optimal detection properties are examined in the typical case by use of the replica method and compared to the detection performance achieved by iterative decoding methods. These codes are found to have phenomena closely resembling those of the well-understood dense codes. The composite model is introduced as an abstraction of canonical sparse and dense disordered spin models. The model includes couplings due to both dense and sparse topologies simultaneously. The new type of code is shown to outperform sparse and dense codes in some regimes, both in optimal performance and in the performance achieved by iterative detection methods in finite systems.
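The kind of simple branch and bound search whose large-system statistics are analysed as a branching process can be written in a few lines for Exact Cover: branch on the element covered by the fewest sets, prune any branch that leaves an element uncoverable. The toy instance below is a standard illustrative example, not taken from the thesis.

```python
# Minimal branch-and-bound backtracking search for Exact Cover.
def exact_cover(universe, sets, partial=()):
    if not universe:
        return partial                       # every element covered exactly once
    # branch on the element covered by the fewest sets (bounding heuristic)
    elem = min(universe, key=lambda e: sum(e in s for s in sets.values()))
    for name, s in sets.items():
        # a set is usable only if it covers elem without re-covering
        # anything already covered (i.e. it lies inside the remainder)
        if elem in s and s <= universe:
            sol = exact_cover(universe - s, sets, partial + (name,))
            if sol is not None:
                return sol
    return None                              # bound: elem cannot be covered

sets = {"A": {1, 4, 7}, "B": {1, 4}, "C": {4, 5, 7},
        "D": {3, 5, 6}, "E": {2, 3, 6, 7}, "F": {2, 7}}
sol = exact_cover(frozenset(range(1, 8)), sets)
print(sol)
```

Each recursive call is a node of the branching process: the number of usable sets at a node is the offspring number whose distribution controls whether the search tree stays subcritical (fast) or explodes near the critical point.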