93 resultados para quantum search


Relevância:

20.00% 20.00%

Publicador:

Resumo:

XML documents are becoming more and more common in various environments. In particular, enterprise-scale document management is commonly centred around XML, and desktop applications as well as online document collections are soon to follow. The growing number of XML documents increases the importance of appropriate indexing methods and search tools in keeping the information accessible. Therefore, we focus on content that is stored in XML format as we develop such indexing methods. Because XML is used for different kinds of content ranging all the way from records of data fields to narrative full-texts, the methods for Information Retrieval are facing a new challenge in identifying which content is subject to data queries and which should be indexed for full-text search. In response to this challenge, we analyse the relation of character content and XML tags in XML documents in order to separate the full-text from data. As a result, we are able to both reduce the size of the index by 5-6\% and improve the retrieval precision as we select the XML fragments to be indexed. Besides being challenging, XML comes with many unexplored opportunities which are not paid much attention in the literature. For example, authors often tag the content they want to emphasise by using a typeface that stands out. The tagged content constitutes phrases that are descriptive of the content and useful for full-text search. They are simple to detect in XML documents, but also possible to confuse with other inline-level text. Nonetheless, the search results seem to improve when the detected phrases are given additional weight in the index. Similar improvements are reported when related content is associated with the indexed full-text including titles, captions, and references. Experimental results show that for certain types of document collections, at least, the proposed methods help us find the relevant answers. Even when we know nothing about the document structure but the XML syntax, we are able to take advantage of the XML structure when the content is indexed for full-text search.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Analyzing statistical dependencies is a fundamental problem in all empirical science. Dependencies help us understand causes and effects, create new scientific theories, and invent cures to problems. Nowadays, large amounts of data is available, but efficient computational tools for analyzing the data are missing. In this research, we develop efficient algorithms for a commonly occurring search problem - searching for the statistically most significant dependency rules in binary data. We consider dependency rules of the form X->A or X->not A, where X is a set of positive-valued attributes and A is a single attribute. Such rules describe which factors either increase or decrease the probability of the consequent A. A classical example are genetic and environmental factors, which can either cause or prevent a disease. The emphasis in this research is that the discovered dependencies should be genuine - i.e. they should also hold in future data. This is an important distinction from the traditional association rules, which - in spite of their name and a similar appearance to dependency rules - do not necessarily represent statistical dependencies at all or represent only spurious connections, which occur by chance. Therefore, the principal objective is to search for the rules with statistical significance measures. Another important objective is to search for only non-redundant rules, which express the real causes of dependence, without any occasional extra factors. The extra factors do not add any new information on the dependence, but can only blur it and make it less accurate in future data. The problem is computationally very demanding, because the number of all possible rules increases exponentially with the number of attributes. In addition, neither the statistical dependency nor the statistical significance are monotonic properties, which means that the traditional pruning techniques do not work. As a solution, we first derive the mathematical basis for pruning the search space with any well-behaving statistical significance measures. The mathematical theory is complemented by a new algorithmic invention, which enables an efficient search without any heuristic restrictions. The resulting algorithm can be used to search for both positive and negative dependencies with any commonly used statistical measures, like Fisher's exact test, the chi-squared measure, mutual information, and z scores. According to our experiments, the algorithm is well-scalable, especially with Fisher's exact test. It can easily handle even the densest data sets with 10000-20000 attributes. Still, the results are globally optimal, which is a remarkable improvement over the existing solutions. In practice, this means that the user does not have to worry whether the dependencies hold in future data or if the data still contains better, but undiscovered dependencies.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Current smartphones have a storage capacity of several gigabytes. More and more information is stored on mobile devices. To meet the challenge of information organization, we turn to desktop search. Users often possess multiple devices, and synchronize (subsets of) information between them. This makes file synchronization more important. This thesis presents Dessy, a desktop search and synchronization framework for mobile devices. Dessy uses desktop search techniques, such as indexing, query and index term stemming, and search relevance ranking. Dessy finds files by their content, metadata, and context information. For example, PDF files may be found by their author, subject, title, or text. EXIF data of JPEG files may be used in finding them. User–defined tags can be added to files to organize and retrieve them later. Retrieved files are ranked according to their relevance to the search query. The Dessy prototype uses the BM25 ranking function, used widely in information retrieval. Dessy provides an interface for locating files for both users and applications. Dessy is closely integrated with the Syxaw file synchronizer, which provides efficient file and metadata synchronization, optimizing network usage. Dessy supports synchronization of search results, individual files, and directory trees. It allows finding and synchronizing files that reside on remote computers, or the Internet. Dessy is designed to solve the problem of efficient mobile desktop search and synchronization, also supporting remote and Internet search. Remote searches may be carried out offline using a downloaded index, or while connected to the remote machine on a weak network. To secure user data, transmissions between the Dessy client and server are encrypted using symmetric encryption. Symmetric encryption keys are exchanged with RSA key exchange. Dessy emphasizes extensibility. Also the cryptography can be extended. Users may tag their files with context tags and control custom file metadata. Adding new indexed file types, metadata fields, ranking methods, and index types is easy. Finding files is done with virtual directories, which are views into the user’s files, browseable by regular file managers. On mobile devices, the Dessy GUI provides easy access to the search and synchronization system. This thesis includes results of Dessy synchronization and search experiments, including power usage measurements. Finally, Dessy has been designed with mobility and device constraints in mind. It requires only MIDP 2.0 Mobile Java with FileConnection support, and Java 1.5 on desktop machines.

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

By detecting leading protons produced in the Central Exclusive Diffractive process, p+p → p+X+p, one can measure the missing mass, and scan for possible new particle states such as the Higgs boson. This process augments - in a model independent way - the standard methods for new particle searches at the Large Hadron Collider (LHC) and will allow detailed analyses of the produced central system, such as the spin-parity properties of the Higgs boson. The exclusive central diffractive process makes possible precision studies of gluons at the LHC and complements the physics scenarios foreseen at the next e+e− linear collider. This thesis first presents the conclusions of the first systematic analysis of the expected precision measurement of the leading proton momentum and the accuracy of the reconstructed missing mass. In this initial analysis, the scattered protons are tracked along the LHC beam line and the uncertainties expected in beam transport and detection of the scattered leading protons are accounted for. The main focus of the thesis is in developing the necessary radiation hard precision detector technology for coping with the extremely demanding experimental environment of the LHC. This will be achieved by using a 3D silicon detector design, which in addition to the radiation hardness of up to 5×10^15 neutrons/cm2, offers properties such as a high signal-to- noise ratio, fast signal response to radiation and sensitivity close to the very edge of the detector. This work reports on the development of a novel semi-3D detector design that simplifies the 3D fabrication process, but conserves the necessary properties of the 3D detector design required in the LHC and in other imaging applications.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Arguments arising from quantum mechanics and gravitation theory as well as from string theory, indicate that the description of space-time as a continuous manifold is not adequate at very short distances. An important candidate for the description of space-time at such scales is provided by noncommutative space-time where the coordinates are promoted to noncommuting operators. Thus, the study of quantum field theory in noncommutative space-time provides an interesting interface where ordinary field theoretic tools can be used to study the properties of quantum spacetime. The three original publications in this thesis encompass various aspects in the still developing area of noncommutative quantum field theory, ranging from fundamental concepts to model building. One of the key features of noncommutative space-time is the apparent loss of Lorentz invariance that has been addressed in different ways in the literature. One recently developed approach is to eliminate the Lorentz violating effects by integrating over the parameter of noncommutativity. Fundamental properties of such theories are investigated in this thesis. Another issue addressed is model building, which is difficult in the noncommutative setting due to severe restrictions on the possible gauge symmetries imposed by the noncommutativity of the space-time. Possible ways to relieve these restrictions are investigated and applied and a noncommutative version of the Minimal Supersymmetric Standard Model is presented. While putting the results obtained in the three original publications into their proper context, the introductory part of this thesis aims to provide an overview of the present situation in the field.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Nucleation is the first step of the process by which gas molecules in the atmosphere condense to form liquid or solid particles. Despite the importance of atmospheric new-particle formation for both climate and health-related issues, little information exists on its precise molecular-level mechanisms. In this thesis, potential nucleation mechanisms involving sulfuric acid together with either water and ammonia or reactive biogenic molecules are studied using quantum chemical methods. Quantum chemistry calculations are based on the numerical solution of Schrödinger's equation for a system of atoms and electrons subject to various sets of approximations, the precise details of which give rise to a large number of model chemistries. A comparison of several different model chemistries indicates that the computational method must be chosen with care if accurate results for sulfuric acid - water - ammonia clusters are desired. Specifically, binding energies are incorrectly predicted by some popular density functionals, and vibrational anharmonicity must be accounted for if quantitatively reliable formation free energies are desired. The calculations reported in this thesis show that a combination of different high-level energy corrections and advanced thermochemical analysis can quantitatively replicate experimental results concerning the hydration of sulfuric acid. The role of ammonia in sulfuric acid - water nucleation was revealed by a series of calculations on molecular clusters of increasing size with respect to all three co-ordinates; sulfuric acid, water and ammonia. As indicated by experimental measurements, ammonia significantly assists the growth of clusters in the sulfuric acid - co-ordinate. The calculations presented in this thesis predict that in atmospheric conditions, this effect becomes important as the number of acid molecules increases from two to three. On the other hand, small molecular clusters are unlikely to contain more than one ammonia molecule per sulfuric acid. This implies that the average NH3:H2SO4 mole ratio of small molecular clusters in atmospheric conditions is likely to be between 1:3 and 1:1. Calculations on charged clusters confirm the experimental result that the HSO4- ion is much more strongly hydrated than neutral sulfuric acid. Preliminary calculations on HSO4- NH3 clusters indicate that ammonia is likely to play at most a minor role in ion-induced nucleation in the sulfuric acid - water system. Calculations of thermodynamic and kinetic parameters for the reaction of stabilized Criegee Intermediates with sulfuric acid demonstrate that quantum chemistry is a powerful tool for investigating chemically complicated nucleation mechanisms. The calculations indicate that if the biogenic Criegee Intermediates have sufficiently long lifetimes in atmospheric conditions, the studied reaction may be an important source of nucleation precursors.