911 results for Query complexity
Abstract:
In many advanced applications, data are described by multiple high-dimensional features. Moreover, different queries may weight these features differently; some may not even specify all the features. In this paper, we propose our solution to support efficient query processing in these applications. We devise a novel representation that compactly captures f features into two components: the first component is a 2D vector that reflects a distance range (minimum and maximum values) of the f features with respect to a reference point (the center of the space) in a metric space, and the second component is a bit signature, with two bits per dimension, obtained by analyzing each feature's descending energy histogram. This representation enables two levels of filtering: the first component prunes away points that do not share similar distance ranges, while the bit signature filters away points based on the dimensions of the relevant features. Moreover, the representation facilitates the use of a single index structure to further speed up processing. We employ the classical B+-tree for this purpose. We also propose a KNN search algorithm that exploits the access orders of critical dimensions of highly selective features and partial distances to prune the search space more effectively. Our extensive experiments on both real-life and synthetic data sets show that the proposed solution offers significant performance advantages over sequential scan and retrieval methods using single and multiple VA-files.
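The first filtering level described above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes Euclidean distance, features that all live in the same space and share one center, and a match criterion under which every feature of a candidate lies within radius `r` of the corresponding query feature. The names `distance_range` and `may_match` are hypothetical. Under those assumptions the triangle inequality guarantees that a true match has its distance range contained in the query range widened by `r`, so anything outside can be pruned without computing full distances.

```python
import math

def distance_range(features, center):
    """Return (min, max) Euclidean distance from each feature vector to the center."""
    dists = [math.dist(f, center) for f in features]
    return min(dists), max(dists)

def may_match(point_range, query_range, r):
    """First-level filter (assumed criterion): keep a point only if its
    distance range fits inside the query range widened by r on both sides."""
    lo_p, hi_p = point_range
    lo_q, hi_q = query_range
    return lo_p >= lo_q - r and hi_p <= hi_q + r

# Illustrative data only: two features per object in a unit square.
center = (0.5, 0.5)
query = [(0.5, 0.55), (0.5, 0.65)]
near_point = [(0.5, 0.6), (0.6, 0.5)]
far_point = [(0.0, 0.0), (1.0, 1.0)]

query_range = distance_range(query, center)
keep_near = may_match(distance_range(near_point, center), query_range, r=0.1)
keep_far = may_match(distance_range(far_point, center), query_range, r=0.1)
```

Here `near_point` survives the filter and `far_point` is pruned. Survivors would still need the second-level bit-signature check and an exact distance computation, neither of which is sketched here.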
Abstract:
A number of proteins are activated by stress stimuli, but none so spectacularly, or with such a degree of complexity, as the tumour suppressor p53 (human p53 gene or protein). Once stabilized, p53 is responsible for the transcriptional activation of a series of proteins involved in cell cycle control, apoptosis and senescence. This protein is present at low levels in resting cells, but after exposure to DNA-damaging agents and other stress stimuli it is stabilized and activated by a series of post-translational modifications that free it from MDM2 (mouse double minute 2, used interchangeably here to denote the human protein as well), a ubiquitin ligase that ubiquitinates it prior to proteasome degradation. The stability of p53 is also influenced by a series of other interacting proteins. In this review, we discuss the post-translational modifications to p53 in response to different stresses and the consequences of these changes.
Abstract:
Spatial data are now used extensively in the Web environment, providing online customized maps and supporting map-based applications. The full potential of Web-based spatial applications, however, has yet to be achieved due to performance issues related to the large sizes and high complexity of spatial data. In this paper, we introduce a multiresolution approach to spatial data management and query processing such that the database server can choose spatial data at the right resolution level for different Web applications. One highly desirable property of the proposed approach is that the server-side processing cost and network traffic can be reduced when the level of resolution required by applications is low. Another advantage is that our approach pushes complex multiresolution structures and algorithms into the spatial database engine. That is, the developer of spatial Web applications need not be concerned with such complexity. This paper explains the basic idea, technical feasibility and applications of multiresolution spatial databases.
Abstract:
A progressive spatial query retrieves spatial data based on previous queries (e.g., to fetch data in a more restricted area with higher resolution). A direct query, on the other hand, is defined as an isolated window query. A multi-resolution spatial database system should support both progressive queries and traditional direct queries. It is conceptually challenging to support both types of query at the same time, as direct queries favour location-based data clustering, whereas progressive queries require fragmented data clustered by resolution. Two new scaleless data structures are proposed in this paper. Experimental results using both synthetic and real-world datasets demonstrate that the query processing time based on the new multiresolution approaches is comparable to, and often better than, that of multi-representation data structures for both types of queries.
Abstract:
Explanations of the difficulty of relative-clause sentences implicate complexity, but the measurement of complexity remains controversial. Four experiments investigated how far relational complexity (RC) theory, which has been found valid for cognitive development and human reasoning, accounts for the difficulty of 16 types of English object- and subject-extracted relative-clause constructions. RC corresponds to the number of nouns assigned to thematic roles in the same decision. Complexity estimates based on RC and those based on maximal integration cost (MIC) were strongly correlated and accounted for similar variance in sentence difficulty (subjective ratings, comprehension accuracy, reading times). Consistent with RC theory, sentences that required more than 4 role assignments in the same decision were extremely difficult for many participants. Performance on nonlinguistic relational tasks predicted comprehension of object-extracted sentences, before and after controlling for subject-extractions. Working memory tasks predicted comprehension of object-extractions before controlling for subject-extractions. The studies extend the RC approach to a linguistic domain.
Abstract:
Semantic data models provide a map of the components of an information system. The characteristics of these models affect their usefulness for various tasks (e.g., information retrieval). The quality of information retrieval has obvious important consequences, both economic and otherwise. Traditionally, database designers have produced parsimonious logical data models. In spite of their increased size, ontologically clearer conceptual models have been shown to facilitate better performance for both problem solving and information retrieval tasks in experimental settings. The experiments producing evidence of enhanced performance for ontologically clearer models have, however, used application domains of modest size. Data models in organizational settings are likely to be substantially larger than those used in these experiments. This research used an experiment to investigate whether the benefits of improved information retrieval performance associated with ontologically clearer models are robust as the size of the application domain increases. The experiment used an application domain approximately twice the size of those tested in prior experiments. The results indicate that, relative to the users of the parsimonious implementation, end users of the ontologically clearer implementation made significantly more semantic errors, took significantly more time to compose their queries, and were significantly less confident in the accuracy of their queries.
Abstract:
New tools derived from advances in molecular biology have not been widely adopted in plant breeding because of the inability to connect information at gene level to the phenotype in a manner that is useful for selection. We explore whether a crop growth and development modelling framework can link phenotype complexity to underlying genetic systems in a way that strengthens molecular breeding strategies. We use gene-to-phenotype simulation studies on sorghum to consider the value to marker-assisted selection of intrinsically stable QTLs that might be generated by physiological dissection of complex traits. The consequences on grain yield of genetic variation in four key adaptive traits (phenology, osmotic adjustment, transpiration efficiency, and staygreen) were simulated for a diverse set of environments by placing the known extent of genetic variation in the context of the physiological determinants framework of a crop growth and development model. It was assumed that the three to five genes associated with each trait had two alleles per locus acting in an additive manner. The effects on average simulated yield, generated by differing combinations of positive alleles for the traits incorporated, varied with environment type. The full matrix of simulated phenotypes, which consisted of 547 location-season combinations and 4235 genotypic expression states, was analysed for genetic and environmental effects. The analysis was conducted in stages with gradually increased understanding of gene-to-phenotype relationships, which would arise from physiological dissection and modelling. It was found that environmental characterisation and physiological knowledge helped to explain and unravel gene and environment context dependencies. We simulated a marker-assisted selection (MAS) breeding strategy based on the analyses of gene effects.
When marker scores were allocated based on the contribution of gene effects to yield in a single environment, there was a wide divergence in rate of yield gain over all environments with breeding cycle depending on the environment chosen for the QTL analysis. It was suggested that knowledge resulting from trait physiology and modelling would overcome this dependency by identifying stable QTLs. The improved predictive power would increase the utility of the QTLs in MAS. Developing and implementing this gene-to-phenotype capability in crop improvement requires enhanced attention to phenotyping, ecophysiological modelling, and validation studies to test the stability of candidate QTLs.
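The environment dependence of marker scores described above can be made concrete with a small sketch. It is an illustration under stated assumptions, not the study's simulation: the score is taken to be a plain additive sum of positive-allele counts (0, 1 or 2 per locus) weighted by per-locus effect estimates, and the locus names, effect values and breeding lines are all hypothetical.

```python
def marker_score(genotype, effects):
    """Additive marker score: positive-allele count at each locus (0, 1 or 2)
    multiplied by that locus's estimated effect, summed over loci."""
    return sum(genotype[locus] * effect for locus, effect in effects.items())

# Hypothetical effect estimates from QTL analyses in two different environments.
effects_env_a = {"phenology_1": 0.2, "staygreen_1": 0.5}
effects_env_b = {"phenology_1": 0.6, "staygreen_1": 0.1}

# Hypothetical breeding lines, each homozygous for one positive allele.
line_x = {"phenology_1": 2, "staygreen_1": 0}
line_y = {"phenology_1": 0, "staygreen_1": 2}

# Rank the two lines by score under each set of effect estimates.
lines = {"line_x": line_x, "line_y": line_y}
best_in_a = max(lines, key=lambda name: marker_score(lines[name], effects_env_a))
best_in_b = max(lines, key=lambda name: marker_score(lines[name], effects_env_b))
```

With these made-up numbers the preferred line flips between the two environments, which is the kind of dependency on the environment chosen for QTL analysis that the abstract reports.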
Abstract:
Multiresolution (or multi-scale) techniques make it possible for Web-based GIS applications to access large datasets. The performance of such systems relies on data transmission over the network and on multiresolution query processing. The latter has received little research attention in the literature so far, and the existing methods are not capable of processing large datasets. In this paper, we aim to improve multiresolution query processing in an online environment. A cost model for such queries is proposed first, followed by three strategies for its optimization. Significant theoretical improvement can be observed when comparing against available methods. Application of these strategies is also discussed, and similar performance enhancement can be expected if they are implemented in online GIS applications.