7 results for query

at Massachusetts Institute of Technology


Relevance:

10.00%

Publisher:

Abstract:

This paper describes START, a natural language system. The system analyzes English text and automatically transforms it into an appropriate representation, the knowledge base, which incorporates the information found in the text. The user gains access to information stored in the knowledge base by querying it in English. The system analyzes the query and decides through a matching process what information in the knowledge base is relevant to the question. Then it retrieves this information and formulates its response, also in English.

Relevance:

10.00%

Publisher:

Abstract:

We consider the question "How should one act when the only goal is to learn as much as possible?" Building on the theoretical results of Fedorov [1972] and MacKay [1992], we apply techniques from Optimal Experiment Design (OED) to guide the query/action selection of a neural network learner. We demonstrate that these techniques allow the learner to minimize its generalization error by exploring its domain efficiently and completely. We conclude that, while not a panacea, OED-based query/action has much to offer, especially in domains where its high computational costs can be tolerated.

Relevance:

10.00%

Publisher:

Abstract:

The task in text retrieval is to find the subset of a collection of documents relevant to a user's information request, usually expressed as a set of words. Classically, documents and queries are represented as vectors of word counts. In its simplest form, relevance is defined to be the dot product between a document and a query vector, a measure of the number of common terms. A central difficulty in text retrieval is that the presence or absence of a word is not sufficient to determine relevance to a query. Linear dimensionality reduction has been proposed as a technique for extracting underlying structure from the document collection. In some domains (such as vision) dimensionality reduction reduces computational complexity. In text retrieval it is more often used to improve retrieval performance. We propose an alternative and novel technique that produces sparse representations constructed from sets of highly related words. Documents and queries are represented by their distance to these sets, and relevance is measured by the number of common clusters. This technique significantly improves retrieval performance, is efficient to compute, and shares properties with the optimal linear projection operator and the independent components of documents.
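The classical baseline mentioned in this abstract (word-count vectors scored by a dot product) can be sketched as follows. This is a minimal illustration only, not the sparse cluster-based technique the abstract proposes; the function names `count_vector` and `relevance`, the whitespace tokenization, and the sample documents are all assumptions made for the example.

```python
from collections import Counter


def count_vector(text):
    """Map a text to a bag-of-words vector of lowercase word counts.

    Whitespace tokenization is an assumption made for this sketch.
    """
    return Counter(text.lower().split())


def relevance(document, query):
    """Dot product between the document and query word-count vectors."""
    doc_vec = count_vector(document)
    query_vec = count_vector(query)
    # Counter returns 0 for missing words, so only shared terms contribute.
    return sum(count * doc_vec[word] for word, count in query_vec.items())


# Hypothetical three-document collection.
docs = [
    "the cat sat on the mat",
    "dogs chase the cat",
    "stock prices fell sharply",
]
query = "cat mat"

scores = [relevance(d, query) for d in docs]
# Rank document indices from most to least relevant.
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

A score of zero for the third document illustrates the difficulty the abstract names: a document that shares no literal words with the query is scored as irrelevant, whatever its actual topic.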

Relevance:

10.00%

Publisher:

Abstract:

In this thesis I present a language for instructing a sheet of identically-programmed, flexible, autonomous agents ("cells") to assemble themselves into a predetermined global shape, using local interactions. The global shape is described as a folding construction on a continuous sheet, using a set of axioms from paper-folding (origami). I provide a means of automatically deriving the cell program, executed by all cells, from the global shape description. With this language, a wide variety of global shapes and patterns can be synthesized, using only local interactions between identically-programmed cells. Examples include flat layered shapes, all plane Euclidean constructions, and a variety of tessellation patterns. In contrast to approaches based on cellular automata or evolution, the cell program is directly derived from the global shape description and is composed from a small number of biologically-inspired primitives: gradients, neighborhood query, polarity inversion, cell-to-cell contact and flexible folding. The cell programs are robust, without relying on regular cell placement, global coordinates, or synchronous operation, and can tolerate a small amount of random cell death. I show that an average cell neighborhood of 15 is sufficient to reliably self-assemble complex shapes and geometric patterns on randomly distributed cells. The language provides many insights into the relationship between local and global descriptions of behavior, such as the advantage of constructive languages, mechanisms for achieving global robustness, and mechanisms for achieving scale-independent shapes from a single cell program. The language suggests a mechanism by which many related shapes can be created by the same cell program, in the manner of D'Arcy Thompson's famous coordinate transformations. The thesis illuminates how complex morphology and pattern can emerge from local interactions, and how one can engineer robust self-assembly.

Relevance:

10.00%

Publisher:

Abstract:

In this paper, we develop a novel index structure to support efficient approximate k-nearest neighbor (KNN) queries in high-dimensional databases. In high-dimensional spaces, the computational cost of the distance (e.g., Euclidean distance) between two points contributes a dominant portion of the overall query response time for in-memory processing. To reduce the distance computation, we first propose a structure (BID) using BIt-Difference to answer approximate KNN queries. The BID employs one bit to represent each feature vector of a point, and the number of bit differences is used to prune the farther points. To handle real datasets, which are typically skewed, we enhance the BID mechanism with clustering, a cluster-adapted bitcoder and dimensional weights; the result is named BID⁺. Extensive experiments are conducted to show that our proposed method yields significant performance advantages over the existing index structures on both real-life and synthetic high-dimensional datasets.
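The general idea of pruning by bit difference before computing exact distances can be sketched as below. The abstract does not give the BID encoding, so everything here is an assumption for illustration: one bit per dimension set by a fixed threshold, a hypothetical `max_bit_diff` cutoff, and the helper names `encode`, `bit_difference` and `approx_knn`. It should not be read as the BID or BID⁺ structure itself.

```python
import math


def encode(point, thresholds):
    """Pack one bit per dimension: 1 if the coordinate exceeds its threshold.

    The thresholding rule is an assumption made for this sketch.
    """
    code = 0
    for value, threshold in zip(point, thresholds):
        code = (code << 1) | (value > threshold)
    return code


def bit_difference(a, b):
    """Number of bit positions in which two codes differ (Hamming distance)."""
    return bin(a ^ b).count("1")


def approx_knn(points, codes, thresholds, query, k, max_bit_diff):
    """Approximate KNN: prune by bit difference, then rank the survivors exactly."""
    q_code = encode(query, thresholds)
    # Cheap filter: keep only points whose code is close to the query's code.
    candidates = [
        i for i, code in enumerate(codes)
        if bit_difference(code, q_code) <= max_bit_diff
    ]
    # Expensive step: exact Euclidean distance, computed only for survivors.
    candidates.sort(key=lambda i: math.dist(points[i], query))
    return candidates[:k]


# Hypothetical 2-D data with two loose groups.
points = [(0.1, 0.2), (0.9, 0.8), (0.2, 0.1), (0.8, 0.9)]
thresholds = (0.5, 0.5)
codes = [encode(p, thresholds) for p in points]

neighbors = approx_knn(points, codes, thresholds, (0.1, 0.25), k=2, max_bit_diff=0)
```

In this sketch the filter discards two of the four points before any Euclidean distance is computed. The result is approximate because a true neighbor lying just across a threshold can be pruned.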

Relevance:

10.00%

Publisher:

Abstract:

In this paper, we present a P2P-based database sharing system that provides information sharing capabilities through keyword-based search techniques. Our system requires neither a global schema nor schema mappings between different databases, and our keyword-based search algorithms are robust in the presence of frequent changes in the content and membership of peers. To facilitate data integration, we introduce a keyword join operator to combine partial answers containing different keywords into complete answers. We also present an efficient algorithm that optimizes the keyword join operations for partial answer integration. Our experimental study on both real and synthetic datasets demonstrates the effectiveness of our algorithms and the efficiency of the proposed query processing strategies.

Relevance:

10.00%

Publisher:

Abstract:

We present a technique for the rapid and reliable evaluation of linear-functional output of elliptic partial differential equations with affine parameter dependence. The essential components are (i) rapidly uniformly convergent reduced-basis approximations: Galerkin projection onto a space W_N spanned by solutions of the governing partial differential equation at N (optimally) selected points in parameter space; (ii) a posteriori error estimation: relaxations of the residual equation that provide inexpensive yet sharp and rigorous bounds for the error in the outputs; and (iii) offline/online computational procedures: stratagems that exploit affine parameter dependence to decouple the generation and projection stages of the approximation process. The operation count for the online stage, in which, given a new parameter value, we calculate the output and associated error bound, depends only on N (typically small) and the parametric complexity of the problem. The method is thus ideally suited to the many-query and real-time contexts. In this paper, we build on this technique to develop a robust inverse computational method for the very fast solution of inverse problems characterized by parametrized partial differential equations. The essential ideas are threefold: first, we apply the technique to the forward problem for the rapid certified evaluation of PDE input-output relations and associated rigorous error bounds; second, we incorporate the reduced-basis approximation and error bounds into the inverse problem formulation; and third, rather than regularize the goodness-of-fit objective, we may instead identify all (or almost all, in the probabilistic sense) system configurations consistent with the available experimental data. Well-posedness is reflected in a bounded "possibility region" that furthermore shrinks as the experimental error is decreased.