648 resultados para QUERIES
Resumo:
This paper aims to identify the communication goal(s) of a user's information-seeking query out of a finite set of within-domain goals in natural language queries. It proposes using Tree-Augmented Naive Bayes networks (TANs) for goal detection. The problem is formulated as N binary decisions, and each is performed by a TAN. Comparative study has been carried out to compare the performance with Naive Bayes, fully-connected TANs, and multi-layer neural networks. Experimental results show that TANs consistently give better results when tested on the ATIS and DARPA Communicator corpora.
Resumo:
The problem of finding the optimal join ordering executing a query to a relational database management system is a combinatorial optimization problem, which makes deterministic exhaustive solution search unacceptable for queries with a great number of joined relations. In this work an adaptive genetic algorithm with dynamic population size is proposed for optimizing large join queries. The performance of the algorithm is compared with that of several classical non-deterministic optimization algorithms. Experiments have been performed optimizing several random queries against a randomly generated data dictionary. The proposed adaptive genetic algorithm with probabilistic selection operator outperforms in a number of test runs the canonical genetic algorithm with Elitist selection as well as two common random search strategies and proves to be a viable alternative to existing non-deterministic optimization approaches.
Resumo:
Large read-only or read-write transactions with a large read set and a small write set constitute an important class of transactions used in such applications as data mining, data warehousing, statistical applications, and report generators. Such transactions are best supported with optimistic concurrency, because locking of large amounts of data for extended periods of time is not an acceptable solution. The abort rate in regular optimistic concurrency algorithms increases exponentially with the size of the transaction. The algorithm proposed in this dissertation solves this problem by using a new transaction scheduling technique that allows a large transaction to commit safely with significantly greater probability that can exceed several orders of magnitude versus regular optimistic concurrency algorithms. A performance simulation study and a formal proof of serializability and external consistency of the proposed algorithm are also presented.^ This dissertation also proposes a new query optimization technique (lazy queries). Lazy Queries is an adaptive query execution scheme which optimizes itself as the query runs. Lazy queries can be used to find an intersection of sub-queries in a very efficient way, which does not require full execution of large sub-queries nor does it require any statistical knowledge about the data.^ An efficient optimistic concurrency control algorithm used in a massively parallel B-tree with variable-length keys is introduced. B-trees with variable-length keys can be effectively used in a variety of database types. In particular, we show how such a B-tree was used in our implementation of a semantic object-oriented DBMS. The concurrency control algorithm uses semantically safe optimistic virtual "locks" that achieve very fine granularity in conflict detection. This algorithm ensures serializability and external consistency by using logical clocks and backward validation of transactional queries. A formal proof of correctness of the proposed algorithm is also presented. ^
Resumo:
This project is about retrieving data in range without allowing the server to read it, when the database is stored in the server. Basically, our goal is to build a database that allows the client to maintain the confidentiality of the data stored, despite all the data is stored in a different location from the client's hard disk. This means that all the information written on the hard disk can be easily read by another person who can do anything with it. Given that, we need to encrypt that data from eavesdroppers or other people. This is because they could sell it or log into accounts and use them for stealing money or identities. In order to achieve this, we need to encrypt the data stored in the hard drive, so that only the possessor of the key can easily read the information stored, while all the others are going to read only encrypted data. Obviously, according to that, all the data management must be done by the client, otherwise any malicious person can easily retrieve it and use it for any malicious intention. All the methods analysed here relies on encrypting data in transit. In the end of this project we analyse 2 theoretical and practical methods for the creation of the above databases and then we tests them with 3 datasets and with 10, 100 and 1000 queries. The scope of this work is to retrieve a trend that can be useful for future works based on this project.
Resumo:
This thesis focuses on the private membership test (PMT) problem and presents three single server protocols to resolve this problem. In the presented solutions, a client can perform an inclusion test for some record x in a server's database, without revealing his record. Moreover after executing the protocols, the contents of server's database remain secret. In each of these solutions, a different cryptographic protocol is utilized to construct a privacy preserving variant of Bloom filter. The three suggested solutions are slightly different from each other, from privacy perspective and also from complexity point of view. Therefore, their use cases are different and it is impossible to choose one that is clearly the best between all three. We present the software developments of the three protocols by utilizing various pseudocodes. The performance of our implementation is measured based on a real case scenario. This thesis is a spin-off from the Academy of Finland research project "Cloud Security Services".
Bioqueries: a collaborative environment to create, explore and share SPARQL queries in Life Sciences
Resumo:
Bioqueries provides a collaborative environment to create, explore, execute, clone and share SPARQL queries (including Federated Queries). Federated SPARQL queries can retrieve information from more than one data source.
Bioqueries: a collaborative environment to create, explore and share SPARQL queries in Life Sciences
Resumo:
Bioqueries provides a collaborative environment to create, explore, execute, clone and share SPARQL queries (including Federated Queries). Federated SPARQL queries can retrieve information from more than one data source.
Resumo:
The ability of agents and services to automatically locate and interact with unknown partners is a goal for both the semantic web and web services. This, \serendipitous interoperability", is hindered by the lack of an explicit means of describing what services (or agents) are able to do, that is, their capabilities. At present, informal descriptions of what services can do are found in \documentation" elements; or they are somehow encoded in operation names and signatures. We show, by ref- erence to existing service examples, how ambiguous and imprecise capa- bility descriptions hamper the attainment of automated interoperability goals in the open, global web environment. In this paper we propose a structured, machine readable description of capabilities, which may help to increase the recall and precision of service discovery mechanisms. Our capability description draws on previous work in capability and process modeling and allows the incorporation of external classi¯cation schemes. The capability description is presented as a conceptual meta model. The model supports conceptual queries and can be used as an extension to the DAML-S Service Pro¯le.
Resumo:
Traditional information retrieval (IR) systems respond to user queries with ranked lists of relevant documents. The separation of content and structure in XML documents allows individual XML elements to be selected in isolation. Thus, users expect XML-IR systems to return highly relevant results that are more precise than entire documents. In this paper we describe the implementation of a search engine for XML document collections. The system is keyword based and is built upon an XML inverted file system. We describe the approach that was adopted to meet the requirements of Content Only (CO) and Vague Content and Structure (VCAS) queries in INEX 2004.
Resumo:
The Chaser’s War on Everything is a night time entertainment program which screened on Australia’s public broadcaster, the ABC in 2006 and 2007. This enormously successful comedy show managed to generate a lot of controversy in its short lifespan (see, for example, Dennehy, 2007; Dubecki, 2007; McLean, 2007; Wright, 2007), but also drew much praise for its satirising of, and commentary on, topical issues. Through interviews with the program’s producers, qualitative audience research and textual analysis, this paper will focus on this show’s media satire, and the segment ‘What Have We Learned From Current Affairs This Week?’ in particular. Viewed as a form of ‘Critical Intertextuality’ (Gray, 2006), this segment (which offered a humorous critique of the ways in which news and current affairs are presented elsewhere on television) may equip citizens with a better understanding of the new genre’s production methods, thus producing a higher level of public media literacy. This paper argues that through its media satire, The Chaser acts not as a traditional news program would in informing the public with new information, but as a text which can inform and shape our understanding of news that already exists within the public sphere. Humorous analyses and critiques of the media (like those analysed in this paper), are in fact very important forms of infotainment, because they can provide “other, ‘improper,’ and yet more media literate and savvy interpretations” (Gray, 2006, p. 4) of the news.
Resumo:
“SOH see significant benefit in digitising its drawings and operation and maintenance manuals. Since SOH do not currently have digital models of the Opera House structure or other components, there is an opportunity for this national case study to promote the application of Digital Facility Modelling using standardized Building Information Models (BIM)”. The digital modelling element of this project examined the potential of building information models for Facility Management focusing on the following areas: • The re-usability of building information for FM purposes • BIM as an Integrated information model for facility management • Extendibility of the BIM to cope with business specific requirements • Commercial facility management software using standardised building information models • The ability to add (organisation specific) intelligence to the model • A roadmap for SOH to adopt BIM for FM The project has established that BIM – building information modelling - is an appropriate and potentially beneficial technology for the storage of integrated building, maintenance and management data for SOH. Based on the attributes of a BIM, several advantages can be envisioned: consistency in the data, intelligence in the model, multiple representations, source of information for intelligent programs and intelligent queries. The IFC – open building exchange standard – specification provides comprehensive support for asset and facility management functions, and offers new management, collaboration and procurement relationships based on sharing of intelligent building data. The major advantages of using an open standard are: information can be read and manipulated by any compliant software, reduced user “lock in” to proprietary solutions, third party software can be the “best of breed” to suit the process and scope at hand, standardised BIM solutions consider the wider implications of information exchange outside the scope of any particular vendor, information can be archived as ASCII files for archival purposes, and data quality can be enhanced as the now single source of users’ information has improved accuracy, correctness, currency, completeness and relevance. SOH current building standards have been successfully drafted for a BIM environment and are confidently expected to be fully developed when BIM is adopted operationally by SOH. There have been remarkably few technical difficulties in converting the House’s existing conventions and standards to the new model based environment. This demonstrates that the IFC model represents world practice for building data representation and management (see Sydney Opera House – FM Exemplar Project Report Number 2005-001-C-3, Open Specification for BIM: Sydney Opera House Case Study). Availability of FM applications based on BIM is in its infancy but focussed systems are already in operation internationally and show excellent prospects for implementation systems at SOH. In addition to the generic benefits of standardised BIM described above, the following FM specific advantages can be expected from this new integrated facilities management environment: faster and more effective processes, controlled whole life costs and environmental data, better customer service, common operational picture for current and strategic planning, visual decision-making and a total ownership cost model. Tests with partial BIM data – provided by several of SOH’s current consultants – show that the creation of a SOH complete model is realistic, but subject to resolution of compliance and detailed functional support by participating software applications. The showcase has demonstrated successfully that IFC based exchange is possible with several common BIM based applications through the creation of a new partial model of the building. Data exchanged has been geometrically accurate (the SOH building structure represents some of the most complex building elements) and supports rich information describing the types of objects, with their properties and relationships.
Resumo:
The construction industry is categorised as being an information-intensive industry and described as one of the most important industries in any developed country, facing a period of rapid and unparalleled change (Industry Science Resources 1999) (Love P.E.D., Tucker S.N. et al. 1996). Project communications are becoming increasingly complex, with a growing need and fundamental drive to collaborate electronically at project level and beyond (Olesen K. and Myers M.D. 1999; Thorpe T. and Mead S. 2001; CITE 2003). Yet, the industry is also identified as having a considerable lack of knowledge and awareness about innovative information and communication technology (ICT) and web-based communication processes, systems and solutions which may prove beneficial in the procurement, delivery and life cycle of projects (NSW Government 1998; Kajewski S. and Weippert A. 2000). The Internet has debatably revolutionised the way in which information is stored, exchanged and viewed, opening new avenues for business, which only a decade ago were deemed almost inconceivable (DCITA 1998; IIB 2002). In an attempt to put these ‘new avenues of business’ into perspective, this report provides an overall ‘snapshot’ of current public and private construction industry sector opportunities and practices in the implementation and application of web-based ICT tools, systems and processes (e-Uptake). Research found that even with a reserved uptake, the construction industry and its participating organisations are making concerted efforts (fortunately with positive results) in taking up innovative forms of doing business via the internet, including e-Tendering (making it possible to manage the entire tender letting process electronically and online) (Anumba C.J. and Ruikar K. 2002; ITCBP 2003). Furthermore, Government (often a key client within the construction industry),and with its increased tendency to transact its business electronically, undoubtedly has an effect on how various private industry consultants, contractors, suppliers, etc. do business (Murray M. 2003) – by offering a wide range of (current and anticipated) e-facilities / services, including e-Tendering (Ecommerce 2002). Overall, doing business electronically is found to have a profound impact on the way today’s construction businesses operate - streamlining existing processes, with the growth in innovative tools, such as e-Tender, offering the construction industry new responsibilities and opportunities for all parties involved (ITCBP 2003). It is therefore important that these opportunities should be accessible to as many construction industry businesses as possible (The Construction Confederation 2001). Historically, there is a considerable exchange of information between various parties during a tendering process, where accuracy and efficiency of documentation is critical. Traditionally this process is either paper-based (involving large volumes of supporting tender documentation), or via a number of stand-alone, non-compatible computer systems, usually costly to both the client and contractor. As such, having a standard electronic exchange format that allows all parties involved in an electronic tender process to access one system only via the Internet, saves both time and money, eliminates transcription errors and increases speed of bid analysis (The Construction Confederation 2001). Supporting this research project’s aims and objectives, researchers set to determine today’s construction industry ‘current state-of-play’ in relation to e-Tendering opportunities. The report also provides brief introductions to several Australian and International e-Tender systems identified during this investigation. e-Tendering, in its simplest form, is described as the electronic publishing, communicating, accessing, receiving and submitting of all tender related information and documentation via the internet, thereby replacing the traditional paper-based tender processes, and achieving a more efficient and effective business process for all parties involved (NT Governement 2000; NT Government 2000; NSW Department of Commerce 2003; NSW Government 2003). Although most of the e-Tender websites investigated at the time, maintain their tendering processes and capabilities are ‘electronic’, research shows these ‘eTendering’ systems vary from being reasonably advanced to more ‘basic’ electronic tender notification and archiving services for various industry sectors. Research also indicates an e-Tender system should have a number of basic features and capabilities, including: • All tender documentation to be distributed via a secure web-based tender system – thereby avoiding the need for collating paperwork and couriers. • The client/purchaser should be able to upload a notice and/or invitation to tender onto the system. • Notification is sent out electronically (usually via email) for suppliers to download the information and return their responses electronically (online). • During the tender period, updates and queries are exchanged through the same e-Tender system. • The client/purchaser should only be able to access the tenders after the deadline has passed. • All tender related information is held in a central database, which should be easily searchable and fully audited, with all activities recorded. • It is essential that tender documents are not read or submitted by unauthorised parties. • Users of the e-Tender system are to be properly identified and registered via controlled access. In simple terms, security has to be as good as if not better than a manual tender process. Data is to be encrypted and users authenticated by means such as digital signatures, electronic certificates or smartcards. • All parties must be assured that no 'undetected' alterations can be made to any tender. • The tenderer should be able to amend the bid right up to the deadline – whilst the client/purchaser cannot obtain access until the submission deadline has passed. • The e-Tender system may also include features such as a database of service providers with spreadsheet-based pricing schedules, which can make it easier for a potential tenderer to electronically prepare and analyse a tender. Research indicates the efficiency of an e-Tender process is well supported internationally, with a significant number, yet similar, e-Tender benefits identified during this investigation. Both construction industry and Government participants generally agree that the implementation of an automated e-Tendering process or system enhances the overall quality, timeliness and cost-effectiveness of a tender process, and provides a more streamlined method of receiving, managing, and submitting tender documents than the traditional paper-based process. On the other hand, whilst there are undoubtedly many more barriers challenging the successful implementation and adoption of an e-Tendering system or process, researchers have also identified a range of challenges and perceptions that seem to hinder the uptake of this innovative approach to tendering electronically. A central concern seems to be that of security - when industry organisations have to use the Internet for electronic information transfer. As a result, when it comes to e-Tendering, industry participants insist these innovative tendering systems are developed to ensure the utmost security and integrity. Finally, if Australian organisations continue to explore the competitive ‘dynamics’ of the construction industry, without realising the current and future, trends and benefits of adopting innovative processes, such as e-Tendering, it will limit their globalising opportunities to expand into overseas markets and allow the continuation of international firms successfully entering local markets. As such, researchers believe increased knowledge, awareness and successful implementation of innovative systems and processes raises great expectations regarding their contribution towards ‘stimulating’ the globalisation of electronic procurement activities, and improving overall business and project performances throughout the construction industry sectors and overall marketplace (NSW Government 2002; Harty C. 2003; Murray M. 2003; Pietroforte R. 2003). Achieving the successful integration of an innovative e-Tender solution with an existing / traditional process can be a complex, and if not done correctly, could lead to failure (Bourn J. 2002).
Resumo:
Query reformulation is a key user behavior during Web search. Our research goal is to develop predictive models of query reformulation during Web searching. This article reports results from a study in which we automatically classified the query-reformulation patterns for 964,780 Web searching sessions, composed of 1,523,072 queries, to predict the next query reformulation. We employed an n-gram modeling approach to describe the probability of users transitioning from one query-reformulation state to another to predict their next state. We developed first-, second-, third-, and fourth-order models and evaluated each model for accuracy of prediction, coverage of the dataset, and complexity of the possible pattern set. The results show that Reformulation and Assistance account for approximately 45% of all query reformulations; furthermore, the results demonstrate that the first- and second-order models provide the best predictability, between 28 and 40% overall and higher than 70% for some patterns. Implications are that the n-gram approach can be used for improving searching systems and searching assistance.
Resumo:
This paper reports findings from a study investigating the effect of integrating sponsored and nonsponsored search engine links into a single web listing. The premise underlying this research is that web searchers are chiefly interested in relevant results. Given the reported negative bias that web searchers have concerning sponsored links, separate listings may be a disservice to web searchers as it might not direct them to relevant websites. Some web meta-search engines integrate sponsored and nonsponsored links into a single listing. Using a web search engine log of over 7 million interactions from hundreds of thousands of users from a major web meta-search engine, we analysed the click-through patterns for both sponsored and nonsponsored links. We also classified web queries as informational, navigational and transactional based on the expected type of content and analysed the click-through patterns of each classification. The findings show that for more than 35% of queries, there are no clicks on any result. More than 80% of web queries are informational in nature and approximately 10% are transactional, and 10% navigational. Sponsored links account for approximately 15% of all clicks. Integrating sponsored and nonsponsored links does not appear to increase the clicks on sponsored listings. We discuss how these research results could enhance future sponsored search platforms.