969 results for Databases - Duplicate tuples


Relevance:

20.00% 20.00%

Publisher:

Abstract:

The principles of organizing a distributed system of databases on the properties of inorganic substances and materials, based on the use of a special reference database, are considered. The reference database includes not only information on where data about a given substance is located in other databases but also brief information on the most common properties of inorganic substances. The proposed principles were successfully implemented in the distributed system of databases on properties of inorganic compounds developed by the A.A. Baikov Institute of Metallurgy and Materials Science of the Russian Academy of Sciences.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Visual information is becoming increasingly important and tools to manage repositories of media collections are highly sought after. In this paper, we focus on image databases and on how to effectively and efficiently access these. In particular, we present effective image browsing systems that are operated on a large multi-touch environment for truly interactive exploration. Not only do image browsers pose a useful alternative to retrieval-based systems, they also provide a visualisation of the whole image collection and let users explore particular parts of the collection. Our systems are based on the idea that visually similar images are located close to each other in the visualisation, that image thumbnails are arranged on a regular lattice (either a regular grid projected on a sphere or a hexagonal lattice), and that large image datasets can be accessed through a hierarchical tree structure. © 2014 International Information Institute.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper surveys research in the field of data mining, which is related to discovering the dependencies between attributes in databases. We consider a number of approaches to finding the distribution intervals of association rules, to discovering branching dependencies between a given set of attributes and a given attribute in a database relation, to finding fractional dependencies between a given set of attributes and a given attribute in a database relation, and to collaborative filtering.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Image database visualisations, in particular mapping-based visualisations, provide an interesting approach to accessing image repositories as they are able to overcome some of the drawbacks associated with retrieval-based approaches. However, making a mapping-based approach work efficiently on large remote image databases has yet to be explored. In this paper, we present Web-Based Images Browser (WBIB), a novel system that efficiently employs image pyramids to reduce bandwidth requirements so that users can interactively explore large remote image databases. © 2013 Authors.
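How an image pyramid can reduce bandwidth is sketched below. This is only an illustration of the general technique, not WBIB's implementation: the function names `pyramid_levels` and `pick_level`, the halving factor, and the `min_side` cut-off are all assumptions made for the example.

```python
def pyramid_levels(width, height, min_side=1):
    """Return the (width, height) of each pyramid level, full resolution first.

    Each level halves both sides of the previous one (an assumed factor);
    halving stops once the shorter side is no larger than min_side.
    """
    levels = [(width, height)]
    while min(width, height) > min_side:
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels.append((width, height))
    return levels


def pick_level(levels, display_width, display_height):
    """Return the index of the smallest level that still covers the display size.

    Falls back to level 0 (full resolution) when no level is large enough.
    """
    for index in range(len(levels) - 1, -1, -1):
        level_width, level_height = levels[index]
        if level_width >= display_width and level_height >= display_height:
            return index
    return 0


levels = pyramid_levels(1024, 768, min_side=96)
thumbnail_level = pick_level(levels, 100, 80)
```

A client showing a 100 x 80 thumbnail would request the 128 x 96 level rather than the 1024 x 768 original, which is 1/64 of the pixels; finer levels are fetched only when the user zooms in.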

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this chapter we provide a comprehensive overview of the emerging field of visualising and browsing image databases. We start with a brief introduction to content-based image retrieval and the traditional query-by-example search paradigm that many retrieval systems employ. We specify the problems associated with this type of interface, such as users not being able to formulate a query due to not having a target image or concept in mind. The idea of browsing systems is then introduced as a means to combat these issues, harnessing the cognitive power of the human mind in order to speed up image retrieval. We detail common methods in which the often high-dimensional feature data extracted from images can be used to visualise image databases in an intuitive way. Systems using dimensionality reduction techniques, such as multi-dimensional scaling, are reviewed along with those that cluster images using either divisive or agglomerative techniques as well as graph-based visualisations. While visualisation of an image collection is useful for providing an overview of the contained images, it forms only part of an image database navigation system. We therefore also present various methods provided by these systems to allow for interactive browsing of these datasets. A further area we explore is user studies of systems and visualisations, where we look at the different evaluations undertaken in order to test usability and compare systems, and highlight the key findings from these studies. We conclude the chapter with several recommendations for future work in this area. © 2011 Springer-Verlag Berlin Heidelberg.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Image collections are ever growing and hence efficient and effective tools to manage these repositories are highly sought after. In this paper, we present effective image browsing systems that are operated on a large multi-touch environment for truly interactive exploration. Not only do image browsers pose a useful alternative to retrieval-based systems, they also provide a visualisation of the whole image collection and allow users to interactively explore particular parts of the collection. Our systems are based on the idea that visually similar images are located close to each other in the visualisation, that image thumbnails are arranged on a regular lattice (either a regular grid projected onto a sphere or a hexagonal lattice), and that large image datasets can be accessed through a hierarchical tree structure. A pilot study has shown that the presented systems do indeed work well and are preferred compared to conventional image browsers. © 2011 IEEE.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Mediation techniques provide interoperability and support integrated query processing among heterogeneous databases. While such techniques help data sharing among different sources, they increase the risk to data security, such as violating access control rules. Successful protection of information by an effective access control mechanism is a basic requirement for interoperation among heterogeneous data sources. This dissertation first identified the challenges that a mediation system faces in achieving both interoperability and security in an interconnected and collaborative computing environment: (1) context-awareness, (2) semantic heterogeneity, and (3) multiple security policy specification. Few existing approaches address all three security challenges in mediation systems. This dissertation provides a modeling and architectural solution to the problem of mediation security that addresses these challenges. A context-aware flexible authorization framework was developed to deal with the security challenges faced by mediation systems. The authorization framework consists of two major tasks: specifying security policies and enforcing security policies. First, the security policy specification provides a generic and extensible method to model the security policies with respect to the challenges posed by the mediation system. The security policies in this study are specified by 5-tuples followed by a series of authorization constraints, which are identified based on the relationship of the different security components in the mediation system. Two essential features of mediation systems, i.e., the relationship among authorization components and interoperability among heterogeneous data sources, are the focus of this investigation. Second, this dissertation supports effective access control on mediation systems while providing uniform access for heterogeneous data sources.
The dynamic security constraints are handled in the authorization phase instead of the authentication phase, so the maintenance cost of security specification can be reduced compared with related solutions.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Graph-structured databases are widely prevalent, and the problem of effective search and retrieval from such graphs has been receiving much attention recently. For example, the Web can be naturally viewed as a graph. Likewise, a relational database can be viewed as a graph where tuples are modeled as vertices connected via foreign-key relationships. Keyword search querying has emerged as one of the most effective paradigms for information discovery, especially over HTML documents in the World Wide Web. One of the key advantages of keyword search querying is its simplicity: users do not have to learn a complex query language, and can issue queries without any prior knowledge about the structure of the underlying data. The purpose of this dissertation was to develop techniques for user-friendly, high-quality and efficient searching of graph-structured databases. Several ranked search methods on data graphs have been studied in recent years. Given a top-k keyword search query on a graph and some ranking criteria, a keyword proximity search finds the top-k answers, where each answer is a substructure of the graph that contains all query keywords and illustrates the relationship between the keywords present in the graph. We applied keyword proximity search on the web and the page graph of web documents to find top-k answers that satisfy the user’s information need and increase user satisfaction. Another effective ranking mechanism applied on data graphs is the authority-flow-based ranking mechanism. Given a top-k keyword search query on a graph, an authority-flow-based search finds the top-k answers, where each answer is a node in the graph ranked according to its relevance and importance to the query. We developed techniques that improved authority-flow-based search on data graphs by creating a framework to explain and reformulate queries, taking into consideration user preferences and feedback.
We also applied the proposed graph search techniques to information discovery over biological databases. Our algorithms were experimentally evaluated for performance and quality. The quality of our method was compared to current approaches by using user surveys.
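Authority-flow ranking can be illustrated with a personalized-PageRank-style iteration, in which authority starts at the nodes that contain the query keywords and flows along the edges of the data graph. The sketch below is an assumption about the general technique, not the dissertation's method: the function names, the damping factor of 0.85, the fixed iteration count, and the toy paper/author graph are all hypothetical.

```python
def authority_flow_rank(edges, base_set, damping=0.85, iterations=50):
    """Score every node by the authority it receives from the base set.

    edges: iterable of (source, target) pairs in the data graph.
    base_set: non-empty set of nodes that contain the query keywords.
    """
    edges = list(edges)
    nodes = set(base_set)
    for source, target in edges:
        nodes.update((source, target))
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    # Authority is injected uniformly at the keyword-matching nodes.
    restart = {n: (1.0 / len(base_set) if n in base_set else 0.0) for n in nodes}
    score = dict(restart)
    for _ in range(iterations):
        updated = {n: (1.0 - damping) * restart[n] for n in nodes}
        for source in nodes:
            targets = out_links[source]
            if targets:
                share = damping * score[source] / len(targets)
                for target in targets:
                    updated[target] += share
            else:
                # Nodes without outgoing edges hand their authority back to the base set.
                for n in base_set:
                    updated[n] += damping * score[source] / len(base_set)
        score = updated
    return score


def top_k(score, k):
    """Return the k highest-scoring nodes, breaking ties by node name."""
    return sorted(score, key=lambda n: (-score[n], n))[:k]


# Toy graph: tuples as vertices, foreign-key references as edges.
edges = [("paper1", "author1"), ("paper2", "author1"), ("paper2", "author2")]
scores = authority_flow_rank(edges, {"paper1", "paper2"})
best = top_k(scores, 1)
```

In this toy graph `author1` receives authority from both keyword-matching papers and therefore outranks `author2`, which is referenced by only one of them.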

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Modern geographical databases, which are at the core of geographic information systems (GIS), store a rich set of aspatial attributes in addition to geographic data. Typically, aspatial information comes in textual and numeric format. Retrieving information constrained on spatial and aspatial data from geodatabases gives GIS users the ability to perform more interesting spatial analyses, and lets applications support composite location-aware searches; for example, in a real estate database: “Find the nearest homes for sale to my current location that have a backyard and whose prices are between $50,000 and $80,000”. Efficient processing of such queries requires combined indexing strategies for multiple types of data. Existing spatial query engines commonly apply a two-filter approach (spatial filter followed by nonspatial filter, or vice versa), which can incur large performance overheads. Moreover, the amount of geolocation data in databases has recently grown rapidly, due in part to advances in geolocation technologies (e.g., GPS-enabled smartphones) that allow users to associate location data with objects or events. This growth poses potential data ingestion challenges for practical GIS databases that must handle large data volumes. In this dissertation, we first show how indexing spatial data with R-trees (a typical data pre-processing task) can be scaled in MapReduce, a widely adopted parallel programming model for data-intensive problems. The evaluation of our algorithms in a Hadoop cluster showed close to linear scalability in building R-tree indexes. Subsequently, we develop efficient algorithms for processing spatial queries with aspatial conditions. Novel techniques for simultaneously indexing spatial with textual and numeric data are developed to that end.
Experimental evaluations with real-world, large spatial datasets measured query response times within the sub-second range for most cases, and up to a few seconds for a small number of cases, which is reasonable for interactive applications. Overall, these results show that the MapReduce parallel model is suitable for indexing tasks in spatial databases, and that an adequate combination of spatial and aspatial attribute indexes can attain acceptable response times for interactive spatial queries with constraints on aspatial data.
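The two-filter baseline that the abstract contrasts against can be sketched for the real estate example as follows. This is a naive in-memory illustration, not the dissertation's combined index: the function name `nearest_homes` and the record fields `xy`, `price`, and `backyard` are assumptions, and a linear sort stands in for a spatial index such as an R-tree.

```python
import math


def nearest_homes(homes, location, k, min_price, max_price):
    """Two-filter query: spatial filter first, aspatial filter second."""
    # Spatial filter: order all candidates by distance to the query location.
    by_distance = sorted(homes, key=lambda home: math.dist(location, home["xy"]))
    # Aspatial filter: keep only homes with a backyard in the price range.
    matches = [
        home
        for home in by_distance
        if home["backyard"] and min_price <= home["price"] <= max_price
    ]
    return matches[:k]


homes = [
    {"id": 1, "xy": (0, 1), "price": 90000, "backyard": True},
    {"id": 2, "xy": (3, 4), "price": 60000, "backyard": True},
    {"id": 3, "xy": (1, 1), "price": 70000, "backyard": False},
    {"id": 4, "xy": (6, 8), "price": 55000, "backyard": True},
]
result = nearest_homes(homes, (0, 0), 2, 50000, 80000)
```

The spatial pass ranks every home, including the two closest ones that then fail the aspatial conditions. That wasted work is the overhead a combined spatial and aspatial index aims to avoid.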

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Large read-only or read-write transactions with a large read set and a small write set constitute an important class of transactions used in such applications as data mining, data warehousing, statistical applications, and report generators. Such transactions are best supported with optimistic concurrency, because locking large amounts of data for extended periods of time is not an acceptable solution. The abort rate in regular optimistic concurrency algorithms increases exponentially with the size of the transaction. The algorithm proposed in this dissertation solves this problem by using a new transaction scheduling technique that allows a large transaction to commit safely with a significantly greater probability, which can exceed that of regular optimistic concurrency algorithms by several orders of magnitude. A performance simulation study and a formal proof of serializability and external consistency of the proposed algorithm are also presented. This dissertation also proposes a new query optimization technique (lazy queries). Lazy queries are an adaptive query execution scheme that optimizes itself as the query runs. Lazy queries can be used to find an intersection of sub-queries in a very efficient way, which requires neither full execution of large sub-queries nor any statistical knowledge about the data. An efficient optimistic concurrency control algorithm used in a massively parallel B-tree with variable-length keys is introduced. B-trees with variable-length keys can be effectively used in a variety of database types. In particular, we show how such a B-tree was used in our implementation of a semantic object-oriented DBMS. The concurrency control algorithm uses semantically safe optimistic virtual "locks" that achieve very fine granularity in conflict detection. This algorithm ensures serializability and external consistency by using logical clocks and backward validation of transactional queries.
A formal proof of correctness of the proposed algorithm is also presented.
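Backward validation with a logical clock can be sketched as below. This shows only the textbook form of optimistic concurrency control, under assumptions: it is neither the dissertation's scheduling technique nor its virtual "locks", and the `Store` and `Transaction` classes and their methods are hypothetical names for the example.

```python
class Store:
    """Key-value store with a logical clock that advances on every commit."""

    def __init__(self):
        self.data = {}
        self.clock = 0
        self.committed = []  # list of (commit_time, frozenset of written keys)

    def begin(self):
        return Transaction(self, self.clock)


class Transaction:
    def __init__(self, store, start_time):
        self.store = store
        self.start_time = start_time
        self.read_set = set()
        self.writes = {}

    def read(self, key):
        if key in self.writes:  # read-your-own-writes, no validation needed
            return self.writes[key]
        self.read_set.add(key)
        return self.store.data.get(key)

    def write(self, key, value):
        self.writes[key] = value  # buffered until commit

    def commit(self):
        # Backward validation: abort if a transaction that committed after
        # this one started wrote any key that this one has read.
        for commit_time, written in self.store.committed:
            if commit_time > self.start_time and written & self.read_set:
                return False
        self.store.clock += 1
        self.store.data.update(self.writes)
        self.store.committed.append((self.store.clock, frozenset(self.writes)))
        return True


store = Store()
reader, writer = store.begin(), store.begin()
reader.read("x")
writer.write("x", 1)
writer_ok = writer.commit()   # no conflicting reads, so it commits
reader.write("y", 2)
reader_ok = reader.commit()   # aborts: "x" was overwritten after it started
```

Every key in the read set is a chance for a conflict, so in this plain scheme a transaction with a large read set is the one most likely to abort. That is the problem the abstract describes for regular optimistic concurrency algorithms.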

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Current technology permits connecting local networks via high-bandwidth telephone lines. Central coordinator nodes may use Intelligent Networks to manage data flow over dialed data lines, e.g. ISDN, and to establish connections between LANs. This dissertation focuses on cost minimization and on establishing operational policies for query distribution over heterogeneous, geographically distributed databases. Based on our study of query distribution strategies, public network tariff policies, and database interface standards, we propose methods for communication cost estimation, strategies for the reduction of bandwidth allocation, and guidelines for central-to-node communication protocols. Our conclusion is that dialed data lines offer a cost-effective alternative for the implementation of distributed database query systems, and that existing commercial software may be adapted to support query processing in heterogeneous distributed database systems.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Thesis digitized by the Direction des bibliothèques de l'Université de Montréal.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

During the summer of 2016, Duke University Libraries staff began a project to update the way that research databases are displayed on the library website. The new research databases page is a customized version of the default A-Z list that Springshare provides for its LibGuides content management system. Duke Libraries staff made adjustments to the content and interface of the page. In order to see how Duke users navigated the new interface, usability testing was conducted on August 9th, 2016.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Thesis digitized by the Direction des bibliothèques de l'Université de Montréal.