99 resultados para Incomplete relational database
Resumo:
There has been tremendous interest in watermarking multimedia content during the past two decades, mainly for proving ownership and detecting tamper. Digital fingerprinting, that deals with identifying malicious user(s), has also received significant attention. While extensive work has been carried out in watermarking of images, other multimedia objects still have enormous research potential. Watermarking database relations is one of the several areas which demand research focus owing to the commercial implications of database theft. Recently, there has been little progress in database watermarking, with most of the watermarking schemes modeled after the irreversible database watermarking scheme proposed by Agrawal and Kiernan. Reversibility is the ability to re-generate the original (unmarked) relation from the watermarked relation using a secret key. As explained in our paper, reversible watermarking schemes provide greater security against secondary watermarking attacks, where an attacker watermarks an already marked relation in an attempt to erase the original watermark. This paper proposes an improvement over the reversible and blind watermarking scheme presented in [5], identifying and eliminating a critical problem with the previous model. Experiments showing that the average watermark detection rate is around 91% even with attacker distorting half of the attributes. The current scheme provides security against secondary watermarking attacks.
Resumo:
Database watermarking has received significant research attention in the current decade. Although, almost all watermarking models have been either irreversible (the original relation cannot be restored from the watermarked relation) and/or non-blind (requiring original relation to detect the watermark in watermarked relation). This model has several disadvantages over reversible and blind watermarking (requiring only watermarked relation and secret key from which the watermark is detected and original relation is restored) including inability to identify rightful owner in case of successful secondary watermarking, inability to revert the relation to original data set (required in high precision industries) and requirement to store unmarked relation at a secure secondary storage. To overcome these problems, we propose a watermarking scheme that is reversible as well as blind. We utilize difference expansion on integers to achieve reversibility. The major advantages provided by our scheme are reversibility to high quality original data set, rightful owner identification, resistance against secondary watermarking attacks, and no need to store original database at a secure secondary storage.
Resumo:
Background Kiwifruit (Actinidia spp.) are a relatively new, but economically important crop grown in many different parts of the world. Commercial success is driven by the development of new cultivars with novel consumer traits including flavor, appearance, healthful components and convenience. To increase our understanding of the genetic diversity and gene-based control of these key traits in Actinidia, we have produced a collection of 132,577 expressed sequence tags (ESTs). Results The ESTs were derived mainly from four Actinidia species (A. chinensis, A. deliciosa, A. arguta and A. eriantha) and fell into 41,858 non redundant clusters (18,070 tentative consensus sequences and 23,788 EST singletons). Analysis of flavor and fragrance-related gene families (acyltransferases and carboxylesterases) and pathways (terpenoid biosynthesis) is presented in comparison with a chemical analysis of the compounds present in Actinidia including esters, acids, alcohols and terpenes. ESTs are identified for most genes in color pathways controlling chlorophyll degradation and carotenoid biosynthesis. In the health area, data are presented on the ESTs involved in ascorbic acid and quinic acid biosynthesis showing not only that genes for many of the steps in these pathways are represented in the database, but that genes encoding some critical steps are absent. In the convenience area, genes related to different stages of fruit softening are identified. Conclusion This large EST resource will allow researchers to undertake the tremendous challenge of understanding the molecular basis of genetic diversity in the Actinidia genus as well as provide an EST resource for comparative fruit genomics. The various bioinformatics analyses we have undertaken demonstrates the extent of coverage of ESTs for genes encoding different biochemical pathways in Actinidia.
Resumo:
Person re-identification is particularly challenging due to significant appearance changes across separate camera views. In order to re-identify people, a representative human signature should effectively handle differences in illumination, pose and camera parameters. While general appearance-based methods are modelled in Euclidean spaces, it has been argued that some applications in image and video analysis are better modelled via non-Euclidean manifold geometry. To this end, recent approaches represent images as covariance matrices, and interpret such matrices as points on Riemannian manifolds. As direct classification on such manifolds can be difficult, in this paper we propose to represent each manifold point as a vector of similarities to class representers, via a recently introduced form of Bregman matrix divergence known as the Stein divergence. This is followed by using a discriminative mapping of similarity vectors for final classification. The use of similarity vectors is in contrast to the traditional approach of embedding manifolds into tangent spaces, which can suffer from representing the manifold structure inaccurately. Comparative evaluations on benchmark ETHZ and iLIDS datasets for the person re-identification task show that the proposed approach obtains better performance than recent techniques such as Histogram Plus Epitome, Partial Least Squares, and Symmetry-Driven Accumulation of Local Features.
Resumo:
The location of previously unseen and unregistered individuals in complex camera networks from semantic descriptions is a time consuming and often inaccurate process carried out by human operators, or security staff on the ground. To promote the development and evaluation of automated semantic description based localisation systems, we present a new, publicly available, unconstrained 110 sequence database, collected from 6 stationary cameras. Each sequence contains detailed semantic information for a single search subject who appears in the clip (gender, age, height, build, hair and skin colour, clothing type, texture and colour), and between 21 and 290 frames for each clip are annotated with the target subject location (over 11,000 frames are annotated in total). A novel approach for localising a person given a semantic query is also proposed and demonstrated on this database. The proposed approach incorporates clothing colour and type (for clothing worn below the waist), as well as height and build to detect people. A method to assess the quality of candidate regions, as well as a symmetry driven approach to aid in modelling clothing on the lower half of the body, is proposed within this approach. An evaluation on the proposed dataset shows that a relative improvement in localisation accuracy of up to 21 is achieved over the baseline technique.
Resumo:
Motivated by the analysis of the Australian Grain Insect Resistance Database (AGIRD), we develop a Bayesian hurdle modelling approach to assess trends in strong resistance of stored grain insects to phosphine over time. The binary response variable from AGIRD indicating presence or absence of strong resistance is characterized by a majority of absence observations and the hurdle model is a two step approach that is useful when analyzing such a binary response dataset. The proposed hurdle model utilizes Bayesian classification trees to firstly identify covariates and covariate levels pertaining to possible presence or absence of strong resistance. Secondly, generalized additive models (GAMs) with spike and slab priors for variable selection are fitted to the subset of the dataset identified from the Bayesian classification tree indicating possibility of presence of strong resistance. From the GAM we assess trends, biosecurity issues and site specific variables influencing the presence of strong resistance using a variable selection approach. The proposed Bayesian hurdle model is compared to its frequentist counterpart, and also to a naive Bayesian approach which fits a GAM to the entire dataset. The Bayesian hurdle model has the benefit of providing a set of good trees for use in the first step and appears to provide enough flexibility to represent the influence of variables on strong resistance compared to the frequentist model, but also captures the subtle changes in the trend that are missed by the frequentist and naive Bayesian models.
Resumo:
During the early design stages of construction projects, accurate and timely cost feedback is critical to design decision making. This is particularly challenging for cost estimators, as they must quickly and accurately estimate the cost of the building when the design is still incomplete and evolving. State-of-the-art software tools typically use a rule-based approach to generate detailed quantities from the design details present in a building model and relate them to the cost items in a cost estimating database. In this paper, we propose a generic approach for creating and maintaining a cost estimate using flexible mappings between a building model and a cost estimate. The approach uses queries on the building design that are used to populate views, and each view is then associated with one or more cost items. The benefit of this approach is that the flexibility of modern query languages allows the estimator to encode a broad variety of relationships between the design and estimate. It also avoids the use of a common standard to which both designers and estimators must conform, allowing the estimator added flexibility and functionality to their work.
Resumo:
Protein adsorption at solid-liquid interfaces is critical to many applications, including biomaterials, protein microarrays and lab-on-a-chip devices. Despite this general interest, and a large amount of research in the last half a century, protein adsorption cannot be predicted with an engineering level, design-orientated accuracy. Here we describe a Biomolecular Adsorption Database (BAD), freely available online, which archives the published protein adsorption data. Piecewise linear regression with breakpoint applied to the data in the BAD suggests that the input variables to protein adsorption, i.e., protein concentration in solution; protein descriptors derived from primary structure (number of residues, global protein hydrophobicity and range of amino acid hydrophobicity, isoelectric point); surface descriptors (contact angle); and fluid environment descriptors (pH, ionic strength), correlate well with the output variable-the protein concentration on the surface. Furthermore, neural network analysis revealed that the size of the BAD makes it sufficiently representative, with a neural network-based predictive error of 5% or less. Interestingly, a consistently better fit is obtained if the BAD is divided in two separate sub-sets representing protein adsorption on hydrophilic and hydrophobic surfaces, respectively. Based on these findings, selected entries from the BAD have been used to construct neural network-based estimation routines, which predict the amount of adsorbed protein, the thickness of the adsorbed layer and the surface tension of the protein-covered surface. While the BAD is of general interest, the prediction of the thickness and the surface tension of the protein-covered layers are of particular relevance to the design of microfluidics devices.
Resumo:
Public buildings and large infrastructure are typically monitored by tens or hundreds of cameras, all capturing different physical spaces and observing different types of interactions and behaviours. However to date, in large part due to limited data availability, crowd monitoring and operational surveillance research has focused on single camera scenarios which are not representative of real-world applications. In this paper we present a new, publicly available database for large scale crowd surveillance. Footage from 12 cameras for a full work day covering the main floor of a busy university campus building, including an internal and external foyer, elevator foyers, and the main external approach are provided; alongside annotation for crowd counting (single or multi-camera) and pedestrian flow analysis for 10 and 6 sites respectively. We describe how this large dataset can be used to perform distributed monitoring of building utilisation, and demonstrate the potential of this dataset to understand and learn the relationship between different areas of a building.
Resumo:
Visual information in the form of lip movements of the speaker has been shown to improve the performance of speech recognition and search applications. In our previous work, we proposed cross database training of synchronous hidden Markov models (SHMMs) to make use of external large and publicly available audio databases in addition to the relatively small given audio visual database. In this work, the cross database training approach is improved by performing an additional audio adaptation step, which enables audio visual SHMMs to benefit from audio observations of the external audio models before adding visual modality to them. The proposed approach outperforms the baseline cross database training approach in clean and noisy environments in terms of phone recognition accuracy as well as spoken term detection (STD) accuracy.
Resumo:
Speech recognition can be improved by using visual information in the form of lip movements of the speaker in addition to audio information. To date, state-of-the-art techniques for audio-visual speech recognition continue to use audio and visual data of the same database for training their models. In this paper, we present a new approach to make use of one modality of an external dataset in addition to a given audio-visual dataset. By so doing, it is possible to create more powerful models from other extensive audio-only databases and adapt them on our comparatively smaller multi-stream databases. Results show that the presented approach outperforms the widely adopted synchronous hidden Markov models (HMM) trained jointly on audio and visual data of a given audio-visual database for phone recognition by 29% relative. It also outperforms the external audio models trained on extensive external audio datasets and also internal audio models by 5.5% and 46% relative respectively. We also show that the proposed approach is beneficial in noisy environments where the audio source is affected by the environmental noise.
Resumo:
This article uses topological approaches to suggest that education is becoming-topological. Analyses presented in a recent double-issue of Theory, Culture & Society are used to demonstrate the utility of topology for education. In particular, the article explains education's topological character through examining the global convergence of education policy, testing and the discursive ranking of systems, schools and individuals in the promise of reforming education through the proliferation of regimes of testing at local and global levels that constitute a new form of governance through data. In this conceptualisation of global education policy changes in the form and nature of testing combine with it the emergence of global policy network to change the nature of the local (national, regional, school and classroom) forces that operate through the ‘system’. While these forces change, they work through a discursivity that produces disciplinary effects, but in a different way. This new–old disciplinarity, or ‘database effect’, is here represented through a topological approach because of its utility for conceiving education in an increasingly networked world.