48 results for Incomplete relational database
Abstract:
A "plan diagram" is a pictorial enumeration of the execution plan choices of a database query optimizer over the relational selectivity space. We have shown recently that, for industrial-strength database engines, these diagrams are often remarkably complex and dense, with a large number of plans covering the space. However, they can often be reduced to much simpler pictures, featuring significantly fewer plans, without materially affecting the query processing quality. Plan reduction has useful implications for the design and usage of query optimizers, including quantifying redundancy in the plan search space, enhancing useability of parametric query optimization, identifying error-resistant and least-expected-cost plans, and minimizing the overheads of multi-plan approaches. We investigate here the plan reduction issue from theoretical, statistical and empirical perspectives. Our analysis shows that optimal plan reduction, w.r.t. minimizing the number of plans, is an NP-hard problem in general, and remains so even for a storage-constrained variant. We then present a greedy reduction algorithm with tight and optimal performance guarantees, whose complexity scales linearly with the number of plans in the diagram for a given resolution. Next, we devise fast estimators for locating the best tradeoff between the reduction in plan cardinality and the impact on query processing quality. Finally, extensive experimentation with a suite of multi-dimensional TPCH-based query templates on industrial-strength optimizers demonstrates that complex plan diagrams easily reduce to "anorexic" (small absolute number of plans) levels incurring only marginal increases in the estimated query processing costs.
Abstract:
When hosting XML information on relational backends, a mapping has to be established between the schemas of the information source and the target storage repositories. A rich body of recent literature exists for mapping isolated components of XML Schema to their relational counterparts, especially with regard to table configurations. In this paper, we present the Elixir system for designing industrial-strength mappings for real-world applications. Specifically, it produces an information-preserving holistic mapping that transforms the complete XML world-view (XML schema with constraints, XML documents, and XQuery queries, including triggers and views) into a full-scale relational mapping (table definitions, integrity constraints, indices, triggers and views) that is tuned to the application workload. A key design feature of Elixir is that it performs all its mapping-related optimizations in the XML source space, rather than in the relational target space. Further, unlike the XML mapping tools of commercial database systems, which rely heavily on user inputs, Elixir takes a principled cost-based approach to automatically find an efficient relational mapping. A prototype of Elixir is operational, and we quantitatively demonstrate its functionality and efficacy on a variety of real-life XML schemas.
Abstract:
The standard quantum search algorithm lacks a feature, enjoyed by many classical algorithms, of having a fixed point, i.e., monotonic convergence towards the solution. Here we present two variations of the quantum search algorithm which get around this limitation. The first replaces the selective inversions in the algorithm by selective phase shifts of $\frac{\pi}{3}$. The second controls the selective inversion operations using two ancilla qubits, and irreversible measurement operations on the ancilla qubits drive the starting state towards the target state. Using $q$ oracle queries, these variations reduce the probability of finding a non-target state from $\epsilon$ to $\epsilon^{2q+1}$, which is asymptotically optimal. Similar ideas can lead to robust quantum algorithms, and provide conceptually new schemes for error correction.
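To see where the $\epsilon^{2q+1}$ bound comes from, here is a sketch of the standard recursion behind the $\frac{\pi}{3}$ construction (the notation $U$, $R_s$, $R_t$ is assumed here for illustration, not quoted from the paper). One level replaces the search operator $U_m$ by $$U_{m+1} = U_m R_s U_m^{\dagger} R_t U_m,$$ where $R_s$ and $R_t$ are selective $\frac{\pi}{3}$ phase shifts of the start and target states. If $U_m$ fails with probability $\epsilon_m$, one level gives $\epsilon_{m+1} = \epsilon_m^{3}$, so depth-$n$ recursion yields $\epsilon_n = \epsilon^{3^n}$, while the oracle-query count obeys $q_{m+1} = 3q_m + 1$ with $q_0 = 0$, i.e. $q_n = (3^n - 1)/2$. Hence $\epsilon_n = \epsilon^{3^n} = \epsilon^{2q_n + 1}$, matching the claim above.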
Abstract:
This paper describes the efforts at the MILE lab, IISc, to create a 100,000-word database each in Kannada and Tamil for the design and development of online handwriting recognition. The data has been collected from over 600 users in order to capture the variations in writing style. We describe features of the scripts and how the number of symbols was reduced so that recognizers can be trained effectively. The list of words includes all the characters, Kannada and Indo-Arabic numerals, punctuation marks and other symbols. A semi-automated tool for the annotation of data from the stroke to the word level is used. It segments each word into stroke groups and also acts as a validation mechanism for the segmentation. The tool displays the strokes, stroke groups and aksharas of a word, and hence can be used to study different writing styles and delayed strokes, and to assign quality tags to the words. The tool is currently being used for annotating the Tamil and Kannada data. The output is stored in a standard XML format.
Abstract:
This paper presents a preliminary analysis of the Kannada WordNet and a set of relevant computational tools. Although the design has been inspired by the well-known English WordNet and, to a certain extent, by the Hindi WordNet, the unique features of the Kannada WordNet are graded antonyms and meronymy relationships, nominal as well as verbal compounding, complex verb constructions, and an efficient underlying database design (built to handle storage and display of Kannada Unicode characters). The Kannada WordNet will not only add to the sparse collection of machine-readable Kannada dictionaries but will also give new insights into the Kannada vocabulary. It provides interfaces for applications such as Kannada machine translation, spell checking and semantic analysis.
Abstract:
This paper presents an enhanced relational description for the prescription of a grasp requirement and the evolution of the posture of a digital human hand towards satisfaction of this requirement. Precise relational description needs anatomical segmentation of the hand geometry into palmar, dorsal and lateral patches using the palm plane and joint location information, and operational segmentation of the object geometry into pull, push and lateral patches with due consideration to the effect of friction. The relational description identifies appropriate patches for a desired grasp condition. Satisfaction of this requirement occurs in two discrete stages, namely, contact establishment and post-contact force exertion for object capture. Contact establishment occurs in four potentially overlapping phases, namely, re-orientation, transfer, pre-shaping, and closing-in. The novel hand re-orientation phase enables the palm to face the object in a task-sequence scenario, transfer takes the wrist to the ballpark, and pre-shaping and closing-in finally achieve the contact. In this paper, an anatomically pertinent closed-form formulation is presented for the closing-in phase to identify the point of contact on the patches prescribed by the relational description. Since mere contact does not ensure grasp, and slip at the point of contact upon application of force is a common occurrence, the effect of slip in the presence of friction has been studied for 2D and 3D object grasping endeavours, and a computational generation of the slip locus is presented. A general slip locus is found to be a non-linear curve even on planar faces. Two varieties of slip phenomena, namely, stabilizing and non-stabilizing slips, and their local characteristics have been identified. Study of the evolution of this slip characteristic over the slip locus exhibits diverse grasping behaviour possibilities. Thus, the relational description paradigm not only makes requirement specification easy and meaningful but also enables high-fidelity hand-object interaction studies.
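Although the abstract does not reproduce the slip formulation, analyses of this kind conventionally start from the Coulomb friction cone (stated here as standard background; the paper's exact formulation may differ): the contact sticks while $\|\mathbf{f}_t\| \le \mu f_n$ and slip initiates when $\|\mathbf{f}_t\| > \mu f_n$, where $\mathbf{f}_t$ and $f_n$ are the tangential and normal components of the contact force and $\mu$ is the coefficient of friction.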
Abstract:
For necessary goods like water, fairness considerations under supply constraints lead to negative externalities. The objective of this paper is to design an infinite-horizon contract, or relational contract (a type of long-term contract), that ensures self-enforcing (instead of court-enforced) behaviour by the agents to mitigate the externality due to fairness issues. In this contract, the consumer is induced to consume at the firm-supply level by the threat of a higher fair price in future time periods. The pricing mechanism computed in this paper internalizes the externality, is shown to be economically efficient, and provides revenue sufficiency.
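The self-enforcement requirement can be made concrete with the textbook incentive constraint for infinite-horizon relational contracts (stated here as general background; the paper's pricing mechanism presumably instantiates a condition of this form): cooperation is self-enforcing when the one-period gain $g$ from deviating (here, over-consuming beyond the firm-supply level) satisfies $$g \le \frac{\delta}{1-\delta}\,\bigl(\pi^{c} - \pi^{p}\bigr),$$ where $\delta$ is the discount factor and $\pi^{c}$, $\pi^{p}$ are the per-period payoffs under cooperation and under the punishment path (the higher fair price), respectively.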
Abstract:
Protein structure alignment is a crucial step in protein structure-function analysis. Despite advances in protein structure alignment algorithms, some locally conformationally similar regions are mislabeled as structurally variable regions (SVRs). These regions are not well superimposed because of differences in their spatial orientations. The Database of Structural Alignments (DoSA) addresses this gap in the identification of local structural similarities obscured in global protein structural alignments by realigning SVRs using an algorithm based on protein blocks. Protein blocks are a structural alphabet that abstracts protein structures into 16 unique local structural motifs. DoSA provides unique information about 159,780 conformationally similar and 56,140 conformationally dissimilar SVRs in 74,705 pairwise structural alignments of homologous proteins. The information provided on conformationally similar and dissimilar SVRs can be helpful in modeling loop regions. It is also conceivable that conformationally similar SVRs with conserved residues could contribute toward the functional integrity of homologues, and hence identifying such SVRs could help in understanding the structural basis of protein function.
Abstract:
The rather low scattering or extinction efficiency of small nanoparticles, metallic and otherwise, is significantly enhanced when they are adsorbed on a larger core particle. But the photoabsorption by particles with varying surface area fractions on a larger core particle is found to be limited by saturation. It is found that the core-shell particle can have a lower absorption efficiency than a dielectric core with its surface partially nucleated with absorbing particles, an "incomplete nanoshell" particle. We have both numerically and experimentally studied the optical efficiencies of titania (TiO2) nucleated to various degrees on silica (SiO2) nanospheres. We show that optimal surface nucleation over cores of appropriate sizes and optical properties will have a direct impact on the applications exploiting the absorption and scattering properties of such composite particles.
Abstract:
We study the problem of analyzing the influence of various factors affecting individual messages posted in social media. The problem is challenging because of the various types of influence propagating through the social media network that act simultaneously on any user. Additionally, the topic composition of the influencing factors and the susceptibility of users to these influences evolve over time. This problem has not been studied before, and off-the-shelf models are unsuitable for this purpose. To capture the complex interplay of these various factors, we propose a new non-parametric model called the Dynamic Multi-Relational Chinese Restaurant Process. It accounts for the user network in data generation and also allows the parameters to evolve over time. Designing inference algorithms for this model suited to large-scale social-media data is another challenge. To this end, we propose a scalable, multi-threaded inference algorithm based on online Gibbs sampling. Extensive evaluations on large-scale Twitter and Facebook data show that the extracted topics, when applied to authorship and commenting prediction, outperform state-of-the-art baselines. More importantly, our model produces valuable insights on topic trends and user personality trends beyond the capability of existing approaches.
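As background for this model family, here is a minimal Python sketch of the plain Chinese Restaurant Process seating rule that the Dynamic Multi-Relational CRP generalizes; it is illustrative only, not the authors' model or inference code, and alpha is the usual concentration parameter.

# Plain CRP sketch (background illustration, not the paper's model).
import random

def crp_seating(n, alpha, seed=0):
    """Sample table assignments for n customers from a CRP(alpha).

    Customer i joins existing table k with probability n_k / (i + alpha)
    and opens a new table with probability alpha / (i + alpha).
    """
    rng = random.Random(seed)
    tables = []        # tables[k] = number of customers seated at table k
    assignments = []
    for i in range(n):  # i earlier customers are already seated
        weights = tables + [alpha]          # last slot = open a new table
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(tables):
            tables.append(1)                # new table
        else:
            tables[k] += 1                  # join existing table
        assignments.append(k)
    return assignments

print(crp_seating(10, alpha=1.0))

The dynamic, multi-relational variant in the paper additionally ties these seating statistics to the user network and lets them evolve over time, with inference performed via online Gibbs updates rather than this exact generative sampling.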
Abstract:
USC-TIMIT is an extensive database of multimodal speech production data, developed to complement existing resources available to the speech research community and with the intention of being continuously refined and augmented. The database currently includes real-time magnetic resonance imaging data from five male and five female speakers of American English. Electromagnetic articulography data have also been collected from four of these speakers. The two modalities were recorded in two independent sessions while the subjects produced the same 460-sentence corpus used previously in the MOCHA-TIMIT database. In both cases the audio signal was recorded and synchronized with the articulatory data. The database and companion software are freely available to the research community. (C) 2014 Acoustical Society of America.
Abstract:
Background: Haemophilus influenzae (H. influenzae) is the causative agent of pneumonia, bacteraemia and meningitis. The organism is responsible for a large number of deaths in both developed and developing countries. Even though the first bacterial genome to be sequenced was that of H. influenzae, there is no database dedicated exclusively to H. influenzae. This prompted us to develop the Haemophilus influenzae Genome Database (HIGDB). Methods: All data of the HIGDB are stored and managed in a MySQL database. The HIGDB is hosted on a Solaris server and developed using PERL modules; Ajax and JavaScript are used for the interface development. Results: The HIGDB contains detailed information on 42,741 proteins and 18,077 genes, including 10 whole-genome sequences, as well as 284 three-dimensional structures of H. influenzae proteins. In addition, the database provides "Motif search" and "GBrowse". The HIGDB is freely accessible through the URL http://bioserver1.physics.iisc.ernet.in/HIGDB/. Discussion: The HIGDB will be a single point of access for bacteriological, clinical, genomic and proteomic information on H. influenzae. The database can also be used to identify DNA motifs within H. influenzae genomes and to compare gene or protein sequences of a particular strain with those of other strains of H. influenzae. (C) 2014 Elsevier Ltd. All rights reserved.
Abstract:
Streptococcus pneumoniae causes pneumonia, septicemia and meningitis, and is responsible for significant mortality both in children and in the elderly. In recent years, whole-genome sequencing of various S. pneumoniae strains has increased manifold, and there is an urgent need to provide organism-specific annotations to the scientific community. This prompted us to develop the Streptococcus pneumoniae Genome Database (SPGDB) to integrate and analyze the completely sequenced and publicly available S. pneumoniae genome sequences. Further, links to several tools are provided to compare the pools of gene and protein sequences, and protein structures, across different strains of S. pneumoniae. SPGDB aids in the analysis of phenotypic variations as well as in extensive genomics and evolutionary studies with reference to S. pneumoniae. (C) 2014 Elsevier Inc. All rights reserved.