9 results for Vector Space IR, Search Engines, Document Clustering, Document

in CentAUR: Central Archive University of Reading - UK


Relevance:

100.00% 100.00%

Publisher:

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents a hierarchical clustering method for semantic Web service discovery. The method aims to improve the accuracy and efficiency of traditional service discovery based on the vector space model. Each Web service is converted into a standard vector format using its Web service description document. With the help of WordNet, a semantic analysis is conducted to reduce the dimension of the term vector and to perform semantic expansion to meet the user’s service request. The process and algorithm of hierarchical-clustering-based semantic Web service discovery are discussed. Validation is carried out on the dataset.
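As a rough illustration of the vector space and hierarchical clustering ideas named in this abstract, the sketch below builds term-frequency vectors from short service descriptions and merges them bottom-up by average cosine similarity. It is not the paper's algorithm: the WordNet-based dimension reduction and semantic expansion are omitted, and the function names (term_vector, cosine, agglomerate), the whitespace tokenisation, and the similarity threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector (assumed whitespace tokenisation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def agglomerate(vectors, threshold):
    """Average-linkage agglomerative clustering over vector indices.

    Merging stops once the most similar pair of clusters falls below
    `threshold` (a hypothetical tuning parameter).
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, (x, y) = max(
            (linkage(clusters[x], clusters[y]), (x, y))
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        if best < threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters
```

Calling agglomerate on the vectors of a handful of descriptions returns lists of indices, one list per cluster; a service request could then be compared against cluster members rather than against every service.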

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Search has become a hot topic in Internet computing, with rival search engines battling to become the de facto Web portal, harnessing search algorithms to wade through information on a scale undreamed of by early information retrieval (IR) pioneers. This article examines how search has matured from its roots in specialized IR systems to become a key foundation of the Web. The authors describe new challenges posed by the Web's scale, and show how search is changing the nature of the Web as much as the Web has changed the nature of search.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Stochastic Diffusion Search (SDS) was developed as a solution to the best-fit search problem. Thus, as a special case, it is capable of solving the transform-invariant pattern recognition problem. SDS is efficient and, although inherently probabilistic, produces very reliable solutions in widely ranging search conditions. However, to date a systematic formal investigation of its properties has not been carried out. This thesis addresses this problem. The thesis reports results pertaining to the global convergence of SDS as well as characterising its time complexity. The main emphasis of the work, however, is on the resource allocation aspect of Stochastic Diffusion Search operations. The thesis introduces a novel model of the algorithm, generalising an Ehrenfest Urn Model from statistical physics. This approach makes it possible to obtain a thorough characterisation of the response of the algorithm in terms of the parameters describing the search conditions in the case of a unique best-fit pattern in the search space. This model is further generalised in order to account for different search conditions: two solutions in the search space and search for a unique solution in a noisy search space. An approximate solution in the case of two alternative solutions is also proposed and compared with predictions of the extended Ehrenfest Urn model. The analysis performed enabled a quantitative characterisation of the Stochastic Diffusion Search in terms of exploration and exploitation of the search space. It appeared that SDS is biased towards the latter mode of operation. This novel perspective on the Stochastic Diffusion Search led to an investigation of extensions of the standard SDS, which would strike a different balance between these two modes of search space processing.
Thus, two novel algorithms were derived from the standard Stochastic Diffusion Search, ‘context-free’ and ‘context-sensitive’ SDS, and their properties were analysed with respect to resource allocation. It appeared that they shared some of the desired features of their predecessor but also possessed some properties not present in the classic SDS. The theory developed in the thesis was illustrated throughout with carefully chosen simulations of a best-fit search for a string pattern, a simple but representative domain, enabling careful control of search conditions.
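For readers unfamiliar with SDS, the sketch below shows the commonly described standard form of the algorithm applied to a best-fit search for a string pattern, the domain mentioned in this abstract. It is a minimal illustration and not the thesis's code: the function name sds_best_fit, the agent count, the iteration count, and the fixed random seed are assumptions, and the 'context-free' and 'context-sensitive' variants are not shown.

```python
import random
from collections import Counter


def sds_best_fit(text, pattern, n_agents=100, n_iter=200, seed=0):
    """Standard SDS sketch: return the text offset held by the most agents.

    Each agent holds one hypothesis (an offset into `text`). All parameter
    defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_pos = len(text) - len(pattern) + 1
    hyps = [rng.randrange(n_pos) for _ in range(n_agents)]
    active = [False] * n_agents

    for _ in range(n_iter):
        # Test phase: each agent checks one randomly chosen character of
        # the pattern against the text at its hypothesised offset.
        for a in range(n_agents):
            i = rng.randrange(len(pattern))
            active[a] = text[hyps[a] + i] == pattern[i]

        # Diffusion phase: each inactive agent polls a random agent and
        # copies its hypothesis if that agent is active; otherwise it
        # draws a fresh random hypothesis.
        for a in range(n_agents):
            if not active[a]:
                other = rng.randrange(n_agents)
                if active[other]:
                    hyps[a] = hyps[other]
                else:
                    hyps[a] = rng.randrange(n_pos)

    # The largest cluster of agents marks the best-fit offset.
    return Counter(hyps).most_common(1)[0][0]
```

Agents at a perfect match stay active in every test phase and so keep recruiting others, which is the exploitation bias the abstract refers to; inactive agents that draw fresh random offsets provide the exploration.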

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This study examines the evolution of prices in markets with Internet price-comparison search engines. The empirical study analyzes laboratory data of prices available to informed consumers, for two industry sizes and two conditions on the sample (complete and incomplete). Distributions are typically bimodal. One of the two modes of the distribution, corresponding to monopoly pricing, attracts an increasing share of pricing strategies over time. The second one, corresponding to interior pricing, follows a decreasing trend. Monopoly pricing can serve as a means of insurance against more competitive (but riskier) behavior. In fact, experimental subjects who initially earn low profits due to interior pricing are more likely to switch to monopoly pricing than subjects who experience good returns from the start.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This article is concerned with the liability of search engines for algorithmically produced search suggestions, such as through Google’s ‘autocomplete’ function. Liability in this context may arise when automatically generated associations have an offensive or defamatory meaning, or may even induce infringement of intellectual property rights. The increasing number of cases that have been brought before courts all over the world raises questions about the conflict between the fundamental freedoms of speech and access to information on the one hand, and the personality rights of individuals (under a broader right of informational self-determination) on the other. In the light of the recent judgment of the Court of Justice of the European Union (EU) in Google Spain v AEPD, this article concludes that many requests for removal of suggestions including private individuals’ information will be successful on the basis of EU data protection law, even absent prejudice to the person concerned.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Search engines exploit the Web's hyperlink structure to help infer information content. The new phenomenon of personal Web logs, or 'blogs', encourages more extensive annotation of Web content. If their resulting link structures bias the Web crawling applications that search engines depend upon, there are implications for another form of annotation rapidly on the rise, the Semantic Web. We conducted a Web crawl of 160 000 pages in which the link structure of the Web is compared with that of several thousand blogs. Results show that the two link structures are significantly different. We analyse the differences and infer the likely effect upon the performance of existing and future Web agents. The Semantic Web offers new opportunities to navigate the Web, but Web agents should be designed to take advantage of the emerging link structures, or their effectiveness will diminish.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This article is concerned with the risks associated with the monopolisation of information that is available from a single source only. Although there is a longstanding consensus that sole-source databases should not receive protection under the EU Database Directive, and there are legislative provisions to ensure that lawful users have access to a database’s contents, Ryanair v PR Aviation challenges this assumption by affirming that the use of non-protected databases can be restricted by contract. Owners of non-protected databases can contractually exclude lawful users from taking the benefit of statutorily permitted uses, because such databases are not covered by the legislation that declares this kind of contract null and void. We argue that this judgment is not consistent with the legislative history and can have a profound impact on the functioning of the digital single market, where new information services, such as meta-search engines or price-comparison websites, base their operation on the systematic extraction and re-utilisation of materials available from online sources. This is an issue that the Commission should address in a forthcoming evaluation of the Database Directive.

Relevance:

50.00% 50.00%

Publisher:

Abstract:

This paper deals with the design of optimal multiple gravity assist trajectories with deep space manoeuvres. A pruning method which considers the sequential nature of the problem is presented. The method locates feasible vectors using local optimization and applies a clustering algorithm to find reduced bounding boxes which can be used in a subsequent optimization step. Since multiple local minima remain within the pruned search space, the use of a global optimization method, such as Differential Evolution, is suggested for finding solutions which are likely to be close to the global optimum. Two case studies are presented.