914 results for High-dimensional index structure
Abstract:
In this paper, we develop a novel index structure to support efficient approximate k-nearest neighbor (KNN) queries in high-dimensional databases. In high-dimensional spaces, the computational cost of the distance (e.g., Euclidean distance) between two points contributes a dominant portion of the overall query response time for memory processing. To reduce the distance computation, we first propose a structure (BID) using BIt-Difference to answer approximate KNN queries. The BID employs one bit to represent each feature vector of a point, and the number of bit differences is used to prune the farther points. To accommodate real datasets, which are typically skewed, we enhance the BID mechanism with clustering, a cluster-adapted bitcoder and dimensional weights; the enhanced structure is named the BID⁺. Extensive experiments show that our proposed method yields significant performance advantages over the existing index structures on both real-life and synthetic high-dimensional datasets.
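The bit-difference idea can be illustrated with a minimal sketch. It is not the authors' BID implementation: it assumes one bit per dimension, set by comparing each coordinate with the per-dimension median, and the names `encode`, `bit_difference`, `approx_knn` and `n_candidates` are hypothetical. Points whose codes differ from the query code in many bits are pruned, and exact Euclidean distances are computed only for the remaining candidates.

```python
from math import dist
from statistics import median


def encode(point, thresholds):
    """Return one bit per dimension: 1 where the coordinate exceeds its threshold.

    Using per-dimension thresholds is an assumption of this sketch.
    """
    return tuple(int(x > t) for x, t in zip(point, thresholds))


def bit_difference(a, b):
    """Count the positions at which two bit codes differ (Hamming distance)."""
    return sum(x != y for x, y in zip(a, b))


def approx_knn(data, query, k, n_candidates):
    """Approximate k-nearest-neighbour search by bit-difference pruning.

    Returns the indices of the k candidates closest to the query.
    """
    # Assumed bitcoder: threshold each dimension at its median.
    thresholds = [median(column) for column in zip(*data)]
    codes = [encode(p, thresholds) for p in data]
    query_code = encode(query, thresholds)

    # Cheap filter: keep the points with the fewest differing bits.
    ranked = sorted(range(len(data)),
                    key=lambda i: bit_difference(codes[i], query_code))
    candidates = ranked[:n_candidates]

    # Expensive step, run only on the survivors: exact Euclidean distance.
    candidates.sort(key=lambda i: dist(data[i], query))
    return candidates[:k]


points = [(0, 0), (1, 1), (10, 10), (11, 11)]
print(approx_knn(points, (10.5, 10.4), k=1, n_candidates=2))
```

The result is approximate because a true neighbour can be discarded by the bit filter; a larger `n_candidates` trades speed for accuracy.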
Abstract:
In this paper, we propose a novel high-dimensional index method, the BM+-tree, to support efficient processing of similarity search queries in high-dimensional spaces. The main idea of the proposed index is to improve data partitioning efficiency in a high-dimensional space by using a rotary binary hyperplane, which further partitions a subspace and can also take advantage of the twin node concept used in the M+-tree. Compared with the key dimension concept in the M+-tree, the binary hyperplane is more effective in data filtering. High space utilization is achieved by dynamically performing data reallocation between twin nodes. In addition, a post processing step is used after index building to ensure effective filtration. Experimental results using two types of real data sets illustrate a significantly improved filtering efficiency.
Abstract:
The three-dimensional solution structure of conotoxin TVIIA, a 30-residue polypeptide from the venom of the piscivorous cone snail Conus tulipa, has been determined using 2D ¹H NMR spectroscopy. TVIIA contains six cysteine residues which form a 'four-loop' structural framework common to many peptides from Conus venoms including the ω-, δ-, κ-, and μO-conotoxins. However, TVIIA does not belong to these well-characterized pharmacological classes of conotoxins, but displays high sequence identity with conotoxin GS, a muscle sodium channel blocker from Conus geographus. Structure calculations were based on 562 interproton distance restraints inferred from NOE data, together with 18 backbone and nine side-chain torsion angle restraints derived from spin-spin coupling constants. The final family of 20 structures had mean pairwise rms differences over residues 2-27 of 0.18 ± 0.05 Å for the backbone atoms and 1.39 ± 0.33 Å for all heavy atoms. The structure consists of a triple-stranded, antiparallel β-sheet with +2x, -1 topology (residues 7-9, 16-20 and 23-27) and several β-turns. The core of the molecule is formed by three disulfide bonds which form a cystine knot motif common to many toxic and inhibitory polypeptides. The global fold, molecular shape and distribution of amino-acid side chains in TVIIA are similar to those previously reported for conotoxin GS, and comparison with other four-loop conotoxin structures provides further indication that TVIIA and GS represent a new and distinct subgroup of this structural family. The structure of TVIIA determined in this study provides the basis for determining a structure-activity relationship for these molecules and their interaction with target receptors.
Abstract:
We focus on mixtures of factor analyzers from the perspective of a method for model-based density estimation from high-dimensional data, and hence for the clustering of such data. This approach enables a normal mixture model to be fitted to a sample of n data points of dimension p, where p is large relative to n. The number of free parameters is controlled through the dimension of the latent factor space. By working in this reduced space, it allows a model for each component-covariance matrix with complexity lying between that of the isotropic and full covariance structure models. We shall illustrate the use of mixtures of factor analyzers in a practical example that considers the clustering of cell lines on the basis of gene expressions from microarray experiments. (C) 2002 Elsevier Science B.V. All rights reserved.
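As a sketch of why the latent dimension controls the parameter count, the standard form of a mixture of factor analyzers can be written as follows. The symbols g (number of components), q (number of factors), π_i, μ_i, B_i and D_i are notation assumed here; only n and p appear in the abstract.

```latex
% Normal mixture density with g components
f(y) = \sum_{i=1}^{g} \pi_i \, \phi(y;\, \mu_i, \Sigma_i)

% Factor-analytic form of each component covariance:
% B_i is a p x q loading matrix (q < p), D_i is a p x p diagonal matrix
\Sigma_i = B_i B_i^{\top} + D_i

% Free parameters per component covariance
\underbrace{pq + p - \tfrac{1}{2}\, q(q-1)}_{\text{factor-analytic}}
\quad \text{versus} \quad
\underbrace{\tfrac{1}{2}\, p(p+1)}_{\text{full covariance}}
```

For small q the factor-analytic count grows roughly linearly in p rather than quadratically, which is what makes fitting feasible when p is large relative to n.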
Abstract:
Transport properties of a GaAs/δ-Mn/GaAs/InxGa1-xAs/GaAs structure with a Mn δ-layer, which is separated from the InxGa1-xAs quantum well (QW) by a 3 nm thick GaAs spacer, were investigated. This high-mobility structure was characterized by X-ray diffractometry and reflectometry. Transport and electrical properties of the structure were measured using a Pulsed Magnetic Field System (PMFS). During investigation of the Shubnikov-de Haas and Hall effects, the main parameters of the QW structure, such as cyclotron mass, Fermi level, g-factor, Dingle temperature and hole concentration, were estimated. The results obtained show the high quality of the prepared structure. However, the anomalous Hall effect at temperatures of 2.09 K, 3 K and 4.2 K is not clearly observed. Attempts to identify the magnetic moment were made. For this purpose, the polarity of the field was reversed at each shot. As a result, no hysteresis loop was observed in the magnetic field dependences of the anomalous Hall resistivity. This can be attributed to the imperfection of the experimental setup.
Abstract:
In order to extend previous SAR and QSAR studies, 3D-QSAR analysis has been performed using CoMFA and CoMSIA approaches applied to a set of 39 alpha-(N)-heterocyclic carboxaldehyde thiosemicarbazones with their inhibitory activity values (IC50) evaluated against ribonucleotide reductase (RNR) of H.Ep.-2 cells (human epidermoid carcinoma), taken from selected literature. Both rigid and field alignment methods, taking the unsubstituted 2-formylpyridine thiosemicarbazone in its syn conformation as template, have been used to generate multiple predictive CoMFA and CoMSIA models derived from training sets and validated with the corresponding test sets. Acceptable predictive correlation coefficients (Q²cv from 0.360 to 0.609 for CoMFA and Q²cv from 0.394 to 0.580 for CoMSIA models) with high fitted correlation coefficients (r² from 0.881 to 0.981 for CoMFA and r² from 0.938 to 0.993 for CoMSIA models) and low standard errors (s from 0.135 to 0.383 for CoMFA and s from 0.098 to 0.240 for CoMSIA models) were obtained. More precise CoMFA and CoMSIA models have been derived considering the subset of thiosemicarbazones (TSC) substituted only at the 5-position of the pyridine ring (n=22). Reasonable predictive correlation coefficients (Q²cv from 0.486 to 0.683 for CoMFA and Q²cv from 0.565 to 0.791 for CoMSIA models) with high fitted correlation coefficients (r² from 0.896 to 0.997 for CoMFA and r² from 0.991 to 0.998 for CoMSIA models) and very low standard errors (s from 0.040 to 0.179 for CoMFA and s from 0.029 to 0.068 for CoMSIA models) were obtained. The stability of each CoMFA and CoMSIA model was further assessed by performing bootstrapping analysis. For the two sets, the generated CoMSIA models showed, in general, better statistics than the corresponding CoMFA models. The analysis of CoMFA and CoMSIA contour maps suggests that a hydrogen bond acceptor near the nitrogen of the pyridine ring can enhance inhibitory activity values.
This observation agrees with literature data, which suggest that the pyridine nitrogen lone pairs can complex with the iron ion, leading to species that inhibit RNR. The derived CoMFA and CoMSIA models contribute to understanding the structural features of this class of TSC as antitumor agents in terms of steric, electrostatic, hydrophobic, hydrogen bond donor and hydrogen bond acceptor fields, as well as to the rational design of inhibitors of this key enzyme.
Abstract:
The notorious "dimensionality curse" is a well-known phenomenon for any multi-dimensional indexes attempting to scale up to high dimensions. One well-known approach to overcome degradation in performance with respect to increasing dimensions is to reduce the dimensionality of the original dataset before constructing the index. However, identifying the correlation among the dimensions and effectively reducing them are challenging tasks. In this paper, we present an adaptive Multi-level Mahalanobis-based Dimensionality Reduction (MMDR) technique for high-dimensional indexing. Our MMDR technique has four notable features compared to existing methods. First, it discovers elliptical clusters for more effective dimensionality reduction by using only the low-dimensional subspaces. Second, data points in the different axis systems are indexed using a single B+-tree. Third, our technique is highly scalable in terms of data size and dimension. Finally, it is also dynamic and adaptive to insertions. An extensive performance study was conducted using both real and synthetic datasets, and the results show that our technique not only achieves higher precision, but also enables queries to be processed efficiently. Copyright Springer-Verlag 2005
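For reference, the Mahalanobis distance that gives the technique its name can be written as below. This is the textbook definition, not a formula taken from the paper; the symbols x, μ and Σ (a point, a cluster mean and the cluster covariance matrix) are notation assumed here.

```latex
d_M(x, \mu) = \sqrt{(x - \mu)^{\top} \, \Sigma^{-1} \, (x - \mu)}
```

Points at a constant Mahalanobis distance from μ lie on an ellipsoid whose axes follow the eigenvectors of Σ, which is why clusters found under this distance are elliptical rather than spherical.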
Abstract:
Indexing high-dimensional datasets has attracted extensive attention from many researchers in the last decade. Since R-tree type index structures are known to suffer from the curse of dimensionality, Pyramid-tree type index structures, which are based on the B-tree, have been proposed to break the curse of dimensionality. However, for high-dimensional data, the number of pyramids is often insufficient to discriminate data points when the number of dimensions is high, and the effectiveness of the Pyramid-tree degrades dramatically as dimensionality increases. In this paper, we focus on one particular aspect of the curse of dimensionality: the surface of a hypercube in a high-dimensional space approaches 100% of the total hypercube volume as the number of dimensions approaches infinity. We propose a new indexing method based on the surface of dimensionality. We prove that the Pyramid-tree technique is a special case of our method. The results of our experiments demonstrate the clear superiority of our novel method.
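One common way to make the surface-volume observation precise is to measure the fraction of a unit hypercube that lies within a small distance eps of its boundary. The inner cube has side 1 - 2*eps, so the shell fraction is 1 - (1 - 2*eps)^d, which tends to 1 as d grows. The sketch below only illustrates that formula; the function name `shell_fraction` and the choice eps = 0.05 are assumptions, not taken from the paper.

```python
def shell_fraction(d, eps):
    """Fraction of the unit hypercube [0, 1]^d lying within eps of its boundary."""
    if d < 1 or not 0.0 <= eps <= 0.5:
        raise ValueError("need d >= 1 and 0 <= eps <= 0.5")
    # Volume of the inner cube of side (1 - 2*eps), subtracted from the total volume 1.
    return 1.0 - (1.0 - 2.0 * eps) ** d


for d in (1, 2, 10, 100, 1000):
    print(f"d={d:5d}  shell fraction={shell_fraction(d, 0.05):.6f}")
```

Even a thin shell therefore holds almost all of the volume once d reaches the hundreds, so uniformly distributed points sit near the surface with high probability.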
Abstract:
Since multimedia data, such as images and videos, are far more expressive and informative than ordinary text-based data, people find it more attractive to communicate and express themselves with them. Additionally, with the rising popularity of social networking tools such as Facebook and Twitter, multimedia information retrieval can no longer be considered a solitary task. Rather, people constantly collaborate with one another while searching and retrieving information. But the very cause of the popularity of multimedia data, the huge and different types of information a single data object can carry, makes their management a challenging task. Multimedia data are commonly represented as multidimensional feature vectors and carry high-level semantic information. These two characteristics make them very different from traditional alpha-numeric data. Thus, trying to manage them with frameworks and rationales designed for primitive alpha-numeric data will be inefficient. An index structure is the backbone of any database management system. It has been seen that index structures present in existing relational database management frameworks cannot handle multimedia data effectively. Thus, in this dissertation, a generalized multidimensional index structure is proposed which accommodates the atypical multidimensional representation and the semantic information carried by different multimedia data seamlessly from within one single framework. Additionally, the dissertation investigates the evolving relationships among multimedia data in a collaborative environment and how such information can help to customize the design of the proposed index structure when it is used to manage multimedia data in a shared environment. Extensive experiments were conducted to demonstrate the usability and better performance of the proposed framework over current state-of-the-art approaches.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
Continuing our series of papers on the three-dimensional (3D) structure and accurate distances of planetary nebulae (PNe), we present here the results obtained for PN NGC 40. Using data from different sources and wavelengths, we construct 3D photoionization models and derive the physical quantities of the ionizing source and nebular gas. The procedure, discussed in detail in the previous papers, consists of the use of 3D photoionization codes constrained by observational data to derive the 3D nebular structure, physical and chemical characteristics, and ionizing star parameters of the objects by simultaneously fitting the integrated line intensities, the density map, the temperature map, and the observed morphologies in different emission lines. For this particular case we combined hydrodynamical simulations with the photoionization scheme in order to obtain self-consistent distributions of density and velocity of the nebular material. Combining the velocity field with the emission-line cubes we also obtained the synthetic position-velocity plots that are compared to the observations. Finally, using theoretical evolutionary tracks of intermediate- and low-mass stars, we derive the mass and age of the central star of NGC 40 as (0.567 ± 0.06) M☉ and (5810 ± 600) yr, respectively. The distance obtained from the fitting procedure was (1150 ± 120) pc.
Abstract:
α-Conotoxin MII, a 16-residue polypeptide from the venom of the piscivorous cone snail Conus magus, is a potent and highly specific blocker of mammalian neuronal nicotinic acetylcholine receptors composed of α3β2 subunits. The role of this receptor type in the modulation of neurotransmitter release and its relevance to the problems of addiction and psychosis emphasize the importance of a structural understanding of the mode of interaction of MII with the α3β2 interface. Here we describe the three-dimensional solution structure of MII determined using 2D ¹H NMR spectroscopy. Structural restraints consisting of 376 interproton distances inferred from NOEs and 12 dihedral restraints derived from spin-spin coupling constants were used as input for simulated annealing calculations and energy minimization in the program X-PLOR. The final set of 20 structures is exceptionally well-defined with mean pairwise rms differences over the whole molecule of 0.07 Å for the backbone atoms and 0.34 Å for all heavy atoms. MII adopts a compact structure incorporating a central segment of α-helix and β-turns at the N- and C-termini. The molecule is stabilized by two disulfide bonds, which provide cross-links between the N-terminus and both the middle and C-terminus of the structure. The susceptibility of the structure to conformational change was examined using several different solvent conditions. While the global fold of MII remains the same, the structure is stabilized in a more hydrophobic environment provided by the addition of acetonitrile or trifluoroethanol to the aqueous solution. The distribution of amino acid side chains in MII creates distinct hydrophobic and polar patches on its surface that may be important for the specific interaction with the α3β2 neuronal nAChR. A comparison of the structure of MII with other neuronal-specific α-conotoxins provides insights into their mode of interaction with these receptors.
Abstract:
Background: UV radiation is the major environmental factor related to development of cutaneous melanoma. Besides sun exposure and the influence of latitude, some host characteristics such as skin phototype and hair and eye color are also risk factors for melanoma. Polymorphisms in DNA repair genes could be good candidates for susceptibility genes, mainly in geographical regions exposed to high solar radiation. Objective: Evaluate the role of host characteristics and DNA repair polymorphisms in melanoma risk in Brazil. Methods: We carried out a hospital-based case-control study in Brazil to evaluate the contribution of host factors and polymorphisms in DNA repair to melanoma risk. A total of 412 patients (202 with melanoma and 210 controls) were analyzed regarding host characteristics for melanoma risk as well as for 11 polymorphisms in DNA repair genes. Results: We found an association of host characteristics with melanoma development, such as eye and hair color, fair skin, history of pigmented lesions removed, sunburns in childhood and adolescence, and also European ancestry. Regarding DNA repair gene polymorphisms, we found protection for the XPG 1104 His/His genotype (OR 0.32; 95% CI 0.13-0.75), and increased risk for three polymorphisms in the XPC gene (PAT+; IV-6A and 939Gln), which represent a haplotype for XPC. Melanoma risk was higher in individuals carrying the complete XPC haplotype than in those carrying each individual polymorphism (OR 3.64; 95% CI 1.77-7.48). Conclusions: Our data indicate that the host factors European ancestry and XPC polymorphisms contributed to melanoma risk in a region exposed to high sun radiation. (C) 2011 Japanese Society for Investigative Dermatology. Published by Elsevier Ireland Ltd. All rights reserved.
Abstract:
NMR spectroscopy and simulated annealing calculations have been used to determine the three-dimensional structure of NaD1, a novel antifungal and insecticidal protein isolated from the flowers of Nicotiana alata. NaD1 is a basic, cysteine-rich protein of 47 residues and is the first example of a plant defensin from flowers to be characterized structurally. Its three-dimensional structure consists of an α-helix and a triple-stranded anti-parallel β-sheet that are stabilized by four intramolecular disulfide bonds. NaD1 features all the characteristics of the cysteine-stabilized αβ motif that has been described for a variety of proteins of differing functions ranging from antibacterial insect defensins and ion channel-perturbing scorpion toxins to an elicitor of the sweet taste response. The protein is biologically active against insect pests, which makes it a potential candidate for use in crop protection. NaD1 shares 31% sequence identity with alfAFP, an antifungal protein from alfalfa that confers resistance to a fungal pathogen in transgenic potatoes. The structure of NaD1 was used to obtain a homology model of alfAFP, since NaD1 has the highest level of sequence identity with alfAFP of any structurally characterized antifungal defensin. The structures of NaD1 and alfAFP were used in conjunction with structure-activity data for the radish defensin Rs-AFP2 to provide an insight into structure-function relationships. In particular, a putative effector site was identified in the structure of NaD1 and in the corresponding homology model of alfAFP. (C) 2002 Elsevier Science Ltd. All rights reserved.