896 resultados para semantic cache


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Topic detection and tracking (TDT) is an area of information retrieval research the focus of which revolves around news events. The problems TDT deals with relate to segmenting news text into cohesive stories, detecting something new, previously unreported, tracking the development of a previously reported event, and grouping together news that discuss the same event. The performance of the traditional information retrieval techniques based on full-text similarity has remained inadequate for online production systems. It has been difficult to make the distinction between same and similar events. In this work, we explore ways of representing and comparing news documents in order to detect new events and track their development. First, however, we put forward a conceptual analysis of the notions of topic and event. The purpose is to clarify the terminology and align it with the process of news-making and the tradition of story-telling. Second, we present a framework for document similarity that is based on semantic classes, i.e., groups of words with similar meaning. We adopt people, organizations, and locations as semantic classes in addition to general terms. As each semantic class can be assigned its own similarity measure, document similarity can make use of ontologies, e.g., geographical taxonomies. The documents are compared class-wise, and the outcome is a weighted combination of class-wise similarities. Third, we incorporate temporal information into document similarity. We formalize the natural language temporal expressions occurring in the text, and use them to anchor the rest of the terms onto the time-line. Upon comparing documents for event-based similarity, we look not only at matching terms, but also how near their anchors are on the time-line. Fourth, we experiment with an adaptive variant of the semantic class similarity system. The news reflect changes in the real world, and in order to keep up, the system has to change its behavior based on the contents of the news stream. We put forward two strategies for rebuilding the topic representations and report experiment results. We run experiments with three annotated TDT corpora. The use of semantic classes increased the effectiveness of topic tracking by 10-30\% depending on the experimental setup. The gain in spotting new events remained lower, around 3-4\%. The anchoring the text to a time-line based on the temporal expressions gave a further 10\% increase the effectiveness of topic tracking. The gains in detecting new events, again, remained smaller. The adaptive systems did not improve the tracking results.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Natural history collections are an invaluable resource housing a wealth of knowledge with a long tradition of contributing to a wide range of fields such as taxonomy, quarantine, conservation and climate change. It is recognized however [Smith and Blagoderov 2012] that such physical collections are often heavily underutilized as a result of the practical issues of accessibility. The digitization of these collections is a step towards removing these access issues, but other hurdles must be addressed before we truly unlock the potential of this knowledge.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper shows that by using only symbolic language phrases, a mobile robot can purposefully navigate to specified rooms in previously unexplored environments. The robot intelligently organises a symbolic language description of the unseen environment and “imagines” a representative map, called the abstract map. The abstract map is an internal representation of the topological structure and spatial layout of symbolically defined locations. To perform goal-directed exploration, the abstract map creates a high-level semantic plan to reason about spaces beyond the robot’s known world. While completing the plan, the robot uses the metric guidance provided by a spatial layout, and grounded observations of door labels, to efficiently guide its navigation. The system is shown to complete exploration in unexplored spaces by travelling only 13.3% further than the optimal path.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we focus on the challenging problem of place categorization and semantic mapping on a robot with-out environment-specific training. Motivated by their ongoing success in various visual recognition tasks, we build our system upon a state-of-the-art convolutional network. We overcome its closed-set limitations by complementing the network with a series of one-vs-all classifiers that can learn to recognize new semantic classes online. Prior domain knowledge is incorporated by embedding the classification system into a Bayesian filter framework that also ensures temporal coherence. We evaluate the classification accuracy of the system on a robot that maps a variety of places on our campus in real-time. We show how semantic information can boost robotic object detection performance and how the semantic map can be used to modulate the robot’s behaviour during navigation tasks. The system is made available to the community as a ROS module.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A straightforward computation of the list of the words (the `tail words' of the list) that are distributionally most similar to a given word (the `head word' of the list) leads to the question: How semantically similar to the head word are the tail words; that is: how similar are their meanings to its meaning? And can we do better? The experiment was done on nearly 18,000 most frequent nouns in a Finnish newsgroup corpus. These nouns are considered to be distributionally similar to the extent that they occur in the same direct dependency relations with the same nouns, adjectives and verbs. The extent of the similarity of their computational representations is quantified with the information radius. The semantic classification of head-tail pairs is intuitive; some tail words seem to be semantically similar to the head word, some do not. Each such pair is also associated with a number of further distributional variables. Individually, their overlap for the semantic classes is large, but the trained classification-tree models have some success in using combinations to predict the semantic class. The training data consists of a random sample of 400 head-tail pairs with the tail word ranked among the 20 distributionally most similar to the head word, excluding names. The models are then tested on a random sample of another 100 such pairs. The best success rates range from 70% to 92% of the test pairs, where a success means that the model predicted my intuitive semantic class of the pair. This seems somewhat promising when distributional similarity is used to capture semantically similar words. This analysis also includes a general discussion of several different similarity formulas, arranged in three groups: those that apply to sets with graded membership, those that apply to the members of a vector space, and those that apply to probability mass functions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Recent evidence from adult pronoun comprehension suggests that semantic factors such as verb transitivity affect referent salience and thereby anap- hora resolution. We tested whether the same semantic factors influence pronoun comprehension in young children. In a visual world study, 3-year- olds heard stories that began with a sentence containing either a high or a low transitivity verb. Looking behaviour to pictures depicting the subject and object of this sentence was recorded as children listened to a subsequent sentence containing a pronoun. Children showed a stronger preference to look to the subject as opposed to the object antecedent in the low transitivity condition. In addition there were general preferences (1) to look to the subject in both conditions and (2) to look more at both potential antecedents in the high transitivity condition. This suggests that children, like adults, are affected by semantic factors, specifically semantic prominence, when interpreting anaphoric pronouns.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Knowledge-based clusters are studied from the structural point of view. Generalized descriptions for such clusters are stated and illustrated. Peculiarities of certain knowledge-based cluster configurations are highlighted. The adequacy of the connectives logical and (“and”) logical or (“exclusive-or”) in describing such clusters is justified. The definition of “concept” is elaborated from the clustering point of view and used to establish the equivalence between, descriptions of clusters and concepts. The order-independence of semantic-directed clustering approach is established formally based on axiomatic considerations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

CMPs enable simultaneous execution of multiple applications on the same platforms that share cache resources. Diversity in the cache access patterns of these simultaneously executing applications can potentially trigger inter-application interference, leading to cache pollution. Whereas a large cache can ameliorate this problem, the issues of larger power consumption with increasing cache size, amplified at sub-100nm technologies, makes this solution prohibitive. In this paper in order to address the issues relating to power-aware performance of caches, we propose a caching structure that addresses the following: 1. Definition of application-specific cache partitions as an aggregation of caching units (molecules). The parameters of each molecule namely size, associativity and line size are chosen so that the power consumed by it and access time are optimal for the given technology. 2. Application-Specific resizing of cache partitions with variable and adaptive associativity per cache line, way size and variable line size. 3. A replacement policy that is transparent to the partition in terms of size, heterogeneity in associativity and line size. Through simulation studies we establish the superiority of molecular cache (caches built as aggregations of molecules) that offers a 29% power advantage over that of an equivalently performing traditional cache.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Software transactional memory (STM) is a promising programming paradigm for shared memory multithreaded programs as an alternative to traditional lock based synchronization. However adoption of STM in mainstream software has been quite low due to its considerable overheads and its poor cache/memory performance. In this paper, we perform a detailed study of the cache behavior of STM applications and quantify the impact of different STM factors on the cache misses experienced by the applications. Based on our analysis, we propose a compiler driven Lock-Data Colocation (LDC), targeted at reducing the cache overheads on STM. We show that LDC is effective in improving the cache behavior of STM applications by reducing the dcache miss latency and improving execution time performance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this work, we propose a new organization for the last level shared cache of a rnulticore system. Our design is based on the observation that the Next-Use distance, measured in terms of intervening misses between the eviction of a line and its next use, for lines brought in by a given delinquent PC falls within a predictable range of values. We exploit this correlation to improve the performance of shared caches in multi-core architectures by proposing the NUcache organization.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we present a cache coherence protocol for multistage interconnection network (MIN)-based multiprocessors with two distinct private caches: private-blocks caches (PCache) containing blocks private to a process and shared-blocks caches (SCache) containing data accessible by all processes. The architecture is extended by a coherence control bus connecting all shared-block cache controllers. Timing problems due to variable transit delays through the MIN are dealt with by introducing Transient states in the proposed cache coherence protocol. The impact of the coherence protocol on system performance is evaluated through a performance study of three phases. Assuming homogeneity of all nodes, a single-node queuing model (phase 3) is developed to analyze system performance. This model is solved for processor and coherence bus utilizations using the mean value analysis (MVA) technique with shared-blocks steady state probabilities (phase 1) and communication delays (phase 2) as input parameters. The performance of our system is compared to that of a system with an equivalent-sized unified cache and with a multiprocessor implementing a directory-based coherence protocol. System performance measures are verified through simulation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Several replacement policies for web caches have been proposed and studied extensively in the literature. Different replacement policies perform better in terms of (i) the number of objects found in the cache (cache hit), (ii) the network traffic avoided by fetching the referenced object from the cache, or (iii) the savings in response time. In this paper, we propose a simple and efficient replacement policy (hereafter known as SE) which improves all three performance measures. Trace-driven simulations were done to evaluate the performance of SE. We compare SE with two widely used and efficient replacement policies, namely Least Recently Used (LRU) and Least Unified Value (LUV) algorithms. Our results show that SE performs at least as well as, if not better than, both these replacement policies. Unlike various other replacement policies proposed in literature, our SE policy does not require parameter tuning or a-priori trace analysis and has an efficient and simple implementation that can be incorporated in any existing proxy server or web server with ease.