12 resultados para XML, Information, Retrieval, Query, Language

em CentAUR: Central Archive University of Reading - UK


Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In any data mining applications, automated text and text and image retrieval of information is needed. This becomes essential with the growth of the Internet and digital libraries. Our approach is based on the latent semantic indexing (LSI) and the corresponding term-by-document matrix suggested by Berry and his co-authors. Instead of using deterministic methods to find the required number of first "k" singular triplets, we propose a stochastic approach. First, we use Monte Carlo method to sample and to build much smaller size term-by-document matrix (e.g. we build k x k matrix) from where we then find the first "k" triplets using standard deterministic methods. Second, we investigate how we can reduce the problem to finding the "k"-largest eigenvalues using parallel Monte Carlo methods. We apply these methods to the initial matrix and also to the reduced one. The algorithms are running on a cluster of workstations under MPI and results of the experiments arising in textual retrieval of Web documents as well as comparison of the stochastic methods proposed are presented. (C) 2003 IMACS. Published by Elsevier Science B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A large volume of visual content is inaccessible until effective and efficient indexing and retrieval of such data is achieved. In this paper, we introduce the DREAM system, which is a knowledge-assisted semantic-driven context-aware visual information retrieval system applied in the film post production domain. We mainly focus on the automatic labelling and topic map related aspects of the framework. The use of the context- related collateral knowledge, represented by a novel probabilistic based visual keyword co-occurrence matrix, had been proven effective via the experiments conducted during system evaluation. The automatically generated semantic labels were fed into the Topic Map Engine which can automatically construct ontological networks using Topic Maps technology, which dramatically enhances the indexing and retrieval performance of the system towards an even higher semantic level.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In general, ranking entities (resources) on the Semantic Web (SW) is subject to importance, relevance, and query length. Few existing SW search systems cover all of these aspects. Moreover, many existing efforts simply reuse the technologies from conventional Information Retrieval (IR), which are not designed for SW data. This paper proposes a ranking mechanism, which includes all three categories of rankings and are tailored to SW data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Search has become a hot topic in Internet computing, with rival search engines battling to become the de facto Web portal, harnessing search algorithms to wade through information on a scale undreamed of by early information retrieval (IR) pioneers. This article examines how search has matured from its roots in specialized IR systems to become a key foundation of the Web. The authors describe new challenges posed by the Web's scale, and show how search is changing the nature of the Web as much as the Web has changed the nature of search

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Web's link structure (termed the Web Graph) is a richly connected set of Web pages. Current applications use this graph for indexing and information retrieval purposes. In contrast the relationship between Web Graph and application is reversed by letting the structure of the Web Graph influence the behaviour of an application. Presents a novel Web crawling agent, AlienBot, the output of which is orthogonally coupled to the enemy generation strategy of a computer game. The Web Graph guides AlienBot, causing it to generate a stochastic process. Shows the effectiveness of such unorthodox coupling to both the playability of the game and the heuristics of the Web crawler. In addition, presents the results of the sample of Web pages collected by the crawling process. In particular, shows: how AlienBot was able to identify the power law inherent in the link structure of the Web; that 61.74 per cent of Web pages use some form of scripting technology; that the size of the Web can be estimated at just over 5.2 billion pages; and that less than 7 per cent of Web pages fully comply with some variant of (X)HTML.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A quasi-optical interferometric technique capable of measuring antenna phase patterns without the need for a heterodyne receiver is presented. It is particularly suited to the characterization of terahertz antennas feeding power detectors or mixers employing quasi-optical local oscillator injection. Examples of recorded antenna phase patterns at frequencies of 1.4 and 2.5 THz using homodyne detectors are presented. To our knowledge, these are the highest frequency antenna phase patterns ever recovered. Knowledge of both the amplitude and phase patterns in the far field enable a Gauss-Hermite or Gauss-Laguerre beam-mode analysis to be carried out for the antenna, of importance in performance optimization calculations, such as antenna gain and beam efficiency parameters at the design and prototype stage of antenna development. A full description of the beam would also be required if the antenna is to be used to feed a quasi-optical system in the near-field to far-field transition region. This situation could often arise when the device is fitted directly at the back of telescopes in flying observatories. A further benefit of the proposed technique is simplicity for characterizing systems in situ, an advantage of considerable importance as in many situations, the components may not be removable for further characterization once assembled. The proposed methodology is generic and should be useful across the wider sensing community, e.g., in single detector acoustic imaging or in adaptive imaging array applications. Furthermore, it is applicable across other frequencies of the EM spectrum, provided adequate spatial and temporal phase stability of the source can be maintained throughout the measurement process. Phase information retrieval is also of importance to emergent research areas, such as band-gap structure characterization, meta-materials research, electromagnetic cloaking, slow light, super-lens design as well as near-field and virtual imaging applications.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Information systems integration becomes critical in enhancing organisational competitiveness through effective use of information resource provided by the whole host of information systems. Information systems integration in its nature is a process of bringing about the capability of communication and information exchange between systems; while interoperability, often as the result of systems integration, is such a capability. However currently there is a lack of theoretical foundation for representation and measure of the interoperability in organisations. Organisational semiotics provides a theoretical foundation for systems interoperability. A notion of ‘semiotic interoperability’ is proposed in this paper as a paradigm, guiding systems integration and measuring degree of interoperability, covering aspects from physical properties, transmission structure of signs, placing emphasis on communicating meaning, intention to social consequence of information.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Social network has gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook LinkedIn and Google+ through the internet and the web 2.0 technologies has become more affordable. People are becoming more interested in and relying on social network for information, news and opinion of other users on diverse subject matters. The heavy reliance on social network sites causes them to generate massive data characterised by three computational issues namely; size, noise and dynamism. These issues often make social network data very complex to analyse manually, resulting in the pertinent use of computational means of analysing them. Data mining provides a wide range of techniques for detecting useful knowledge from massive datasets like trends, patterns and rules [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis, and data interpretation processes in the course of data analysis. This survey discusses different data mining techniques used in mining diverse aspects of the social network over decades going from the historical techniques to the up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in the Table.1 including the tools employed as well as names of their authors.