998 resultados para Entity-oriented Retrieval


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Traditional information retrieval (IR) systems respond to user queries with ranked lists of relevant documents. The separation of content and structure in XML documents allows individual XML elements to be selected in isolation. Thus, users expect XML-IR systems to return highly relevant results that are more precise than entire documents. In this paper we describe the implementation of a search engine for XML document collections. The system is keyword based and is built upon an XML inverted file system. We describe the approach that was adopted to meet the requirements of Content Only (CO) and Vague Content and Structure (VCAS) queries in INEX 2004.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

With the application of GIS methodologies to spatial data, researchers can now identify patterns of occurrence for many social problems including health-issues and crime. Further more, since this type of data also contains clues as to the underlying causes of social problems, it can be used to make well-educated and consequently, more effective policy decisions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Purpose – Financial information about costs and return on investments are of key importance to strategic decision-making but also in the context of process improvement or business engineering. In this paper we propose a value-oriented approach to business process modeling based on key concepts and metrics from operations and financial management, to aid decision making in process re-design projects on the basis of process models. Design/methodology/approach – We suggest a theoretically founded extension to current process modeling approaches, and delineate a framework as well as methodical support to incorporate financial information into process re-design. We use two case studies to evaluate the suggested approach. Findings – Based on two case studies, we show that the value-oriented process modeling approach facilitates and improves managerial decision-making in the context of process re-design. Research limitations / implications – We present design work and two case studies. More research is needed to more thoroughly evaluate the presented approach in a variety of real-life process modeling settings. Practical implications – We show how our approach enables decision makers to make investment decisions in process re-design projects, and also how other decisions, for instance in the context of enterprise architecture design, can be facilitated. Originality/value – This study reports on an attempt to integrate financial considerations into the act of process modeling, in order to provide more comprehensive decision making support in process re-design projects.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

With the advent of Service Oriented Architecture, Web Services have gained tremendous popularity. Due to the availability of a large number of Web services, finding an appropriate Web service according to the requirement of the user is a challenge. This warrants the need to establish an effective and reliable process of Web service discovery. A considerable body of research has emerged to develop methods to improve the accuracy of Web service discovery to match the best service. The process of Web service discovery results in suggesting many individual services that partially fulfil the user’s interest. By considering the semantic relationships of words used in describing the services as well as the use of input and output parameters can lead to accurate Web service discovery. Appropriate linking of individual matched services should fully satisfy the requirements which the user is looking for. This research proposes to integrate a semantic model and a data mining technique to enhance the accuracy of Web service discovery. A novel three-phase Web service discovery methodology has been proposed. The first phase performs match-making to find semantically similar Web services for a user query. In order to perform semantic analysis on the content present in the Web service description language document, the support-based latent semantic kernel is constructed using an innovative concept of binning and merging on the large quantity of text documents covering diverse areas of domain of knowledge. The use of a generic latent semantic kernel constructed with a large number of terms helps to find the hidden meaning of the query terms which otherwise could not be found. Sometimes a single Web service is unable to fully satisfy the requirement of the user. In such cases, a composition of multiple inter-related Web services is presented to the user. The task of checking the possibility of linking multiple Web services is done in the second phase. Once the feasibility of linking Web services is checked, the objective is to provide the user with the best composition of Web services. In the link analysis phase, the Web services are modelled as nodes of a graph and an allpair shortest-path algorithm is applied to find the optimum path at the minimum cost for traversal. The third phase which is the system integration, integrates the results from the preceding two phases by using an original fusion algorithm in the fusion engine. Finally, the recommendation engine which is an integral part of the system integration phase makes the final recommendations including individual and composite Web services to the user. In order to evaluate the performance of the proposed method, extensive experimentation has been performed. Results of the proposed support-based semantic kernel method of Web service discovery are compared with the results of the standard keyword-based information-retrieval method and a clustering-based machine-learning method of Web service discovery. The proposed method outperforms both information-retrieval and machine-learning based methods. Experimental results and statistical analysis also show that the best Web services compositions are obtained by considering 10 to 15 Web services that are found in phase-I for linking. Empirical results also ascertain that the fusion engine boosts the accuracy of Web service discovery by combining the inputs from both the semantic analysis (phase-I) and the link analysis (phase-II) in a systematic fashion. Overall, the accuracy of Web service discovery with the proposed method shows a significant improvement over traditional discovery methods.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Peer to peer systems have been widely used in the internet. However, most of the peer to peer information systems are still missing some of the important features, for example cross-language IR (Information Retrieval) and collection selection / fusion features. Cross-language IR is the state-of-art research area in IR research community. It has not been used in any real world IR systems yet. Cross-language IR has the ability to issue a query in one language and receive documents in other languages. In typical peer to peer environment, users are from multiple countries. Their collections are definitely in multiple languages. Cross-language IR can help users to find documents more easily. E.g. many Chinese researchers will search research papers in both Chinese and English. With Cross-language IR, they can do one query in Chinese and get documents in two languages. The Out Of Vocabulary (OOV) problem is one of the key research areas in crosslanguage information retrieval. In recent years, web mining was shown to be one of the effective approaches to solving this problem. However, how to extract Multiword Lexical Units (MLUs) from the web content and how to select the correct translations from the extracted candidate MLUs are still two difficult problems in web mining based automated translation approaches. Discovering resource descriptions and merging results obtained from remote search engines are two key issues in distributed information retrieval studies. In uncooperative environments, query-based sampling and normalized-score based merging strategies are well-known approaches to solve such problems. However, such approaches only consider the content of the remote database but do not consider the retrieval performance of the remote search engine. This thesis presents research on building a peer to peer IR system with crosslanguage IR and advance collection profiling technique for fusion features. Particularly, this thesis first presents a new Chinese term measurement and new Chinese MLU extraction process that works well on small corpora. An approach to selection of MLUs in a more accurate manner is also presented. After that, this thesis proposes a collection profiling strategy which can discover not only collection content but also retrieval performance of the remote search engine. Based on collection profiling, a web-based query classification method and two collection fusion approaches are developed and presented in this thesis. Our experiments show that the proposed strategies are effective in merging results in uncooperative peer to peer environments. Here, an uncooperative environment is defined as each peer in the system is autonomous. Peer like to share documents but they do not share collection statistics. This environment is a typical peer to peer IR environment. Finally, all those approaches are grouped together to build up a secure peer to peer multilingual IR system that cooperates through X.509 and email system.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Digital forensics relates to the investigation of a crime or other suspect behaviour using digital evidence. Previous work has dealt with the forensic reconstruction of computer-based activity on single hosts, but with the additional complexity involved with a distributed environment, a Web services-centric approach is required. A framework for this type of forensic examination needs to allow for the reconstruction of transactions spanning multiple hosts, platforms and applications. A tool implementing such an approach could be used by an investigator to identify scenarios of Web services being misused, exploited, or otherwise compromised. This information could be used to redesign Web services in order to mitigate identified risks. This paper explores the requirements of a framework for performing effective forensic examinations in a Web services environment. This framework will be necessary in order to develop forensic tools and techniques for use in service oriented architectures.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Measuring quality attributes of object-oriented designs (e.g. maintainability and performance) has been covered by a number of studies. However, these studies have not considered security as much as other quality attributes. Also, most security studies focus at the level of individual program statements. This approach makes it hard and expensive to discover and fix vulnerabilities caused by design errors. In this work, we focus on the security design of an object oriented application and define a number of security metrics. These metrics allow designers to discover and fix security vulnerabilities at an early stage, and help compare the security of various alternative designs. In particular, we propose seven security metrics to measure Data Encapsulation (accessibility) and Cohesion (interactions) of a given object-oriented class from the point of view of potential information flow.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Travel surveys were conducted for collecting data related to residents’ travel at Kelvin Grove Urban Village (KGUV). Currently, KGUV has residents living in the affordable apartments, apartments, townhouses and student accommodation. As a part of data collection process, travel surveys were undertaken for residents living in apartments, affordable apartments and student accommodation. This document contains the questionnaire form used to collect the demographic and travel data related to residents at KGUV. A mail back survey technique was used to collect data for residents living in affordable apartment and apartments, and an intercept surveys was conducted for residents living in student accommodation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

An information filtering (IF) system monitors an incoming document stream to find the documents that match the information needs specified by the user profiles. To learn to use the user profiles effectively is one of the most challenging tasks when developing an IF system. With the document selection criteria better defined based on the users’ needs, filtering large streams of information can be more efficient and effective. To learn the user profiles, term-based approaches have been widely used in the IF community because of their simplicity and directness. Term-based approaches are relatively well established. However, these approaches have problems when dealing with polysemy and synonymy, which often lead to an information overload problem. Recently, pattern-based approaches (or Pattern Taxonomy Models (PTM) [160]) have been proposed for IF by the data mining community. These approaches are better at capturing sematic information and have shown encouraging results for improving the effectiveness of the IF system. On the other hand, pattern discovery from large data streams is not computationally efficient. Also, these approaches had to deal with low frequency pattern issues. The measures used by the data mining technique (for example, “support” and “confidences”) to learn the profile have turned out to be not suitable for filtering. They can lead to a mismatch problem. This thesis uses the rough set-based reasoning (term-based) and pattern mining approach as a unified framework for information filtering to overcome the aforementioned problems. This system consists of two stages - topic filtering and pattern mining stages. The topic filtering stage is intended to minimize information overloading by filtering out the most likely irrelevant information based on the user profiles. A novel user-profiles learning method and a theoretical model of the threshold setting have been developed by using rough set decision theory. The second stage (pattern mining) aims at solving the problem of the information mismatch. This stage is precision-oriented. A new document-ranking function has been derived by exploiting the patterns in the pattern taxonomy. The most likely relevant documents were assigned higher scores by the ranking function. Because there is a relatively small amount of documents left after the first stage, the computational cost is markedly reduced; at the same time, pattern discoveries yield more accurate results. The overall performance of the system was improved significantly. The new two-stage information filtering model has been evaluated by extensive experiments. Tests were based on the well-known IR bench-marking processes, using the latest version of the Reuters dataset, namely, the Reuters Corpus Volume 1 (RCV1). The performance of the new two-stage model was compared with both the term-based and data mining-based IF models. The results demonstrate that the proposed information filtering system outperforms significantly the other IF systems, such as the traditional Rocchio IF model, the state-of-the-art term-based models, including the BM25, Support Vector Machines (SVM), and Pattern Taxonomy Model (PTM).

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The first Workshop on Service-Oriented Business Networks and Ecosystems (SOBNE ’09) is held in conjunction with the 13th IEEE International EDOC Conference on 2 September 2009 in Auckland, New Zealand. The SOBNE ’09 program includes 9 peer-reviewed papers (7 full and 2 short papers) and an open discussion session. This introduction to the Proceedings of SOBNE ’09 starts with a brief background of the motivation for the workshop. Next, it contains a short description of the peer-reviewed papers, and finally, after some concluding statements and the announcement of the winners of the Best Reviewer Award and the Most Promising Research Award, it lists the members of the SOBNE ’09 Program Committee and external reviewers of the workshop submissions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Cultural objects are increasingly generated and stored in digital form, yet effective methods for their indexing and retrieval still remain an important area of research. The main problem arises from the disconnection between the content-based indexing approach used by computer scientists and the description-based approach used by information scientists. There is also a lack of representational schemes that allow the alignment of the semantics and context with keywords and low-level features that can be automatically extracted from the content of these cultural objects. This paper presents an integrated approach to address these problems, taking advantage of both computer science and information science approaches. We firstly discuss the requirements from a number of perspectives: users, content providers, content managers and technical systems. We then present an overview of our system architecture and describe various techniques which underlie the major components of the system. These include: automatic object category detection; user-driven tagging; metadata transform and augmentation, and an expression language for digital cultural objects. In addition, we discuss our experience on testing and evaluating some existing collections, analyse the difficulties encountered and propose ways to address these problems.