986 results for Semistructured documents
Abstract:
Relevance Feedback (RF) has proven very effective for improving retrieval accuracy. Adaptive information filtering (AIF) technology has benefited from the improvements achieved in all the tasks involved over the last decades. A difficult problem in AIF has been how to update the system with new feedback efficiently and effectively. In current feedback methods, the updating processes focus on updating system parameters. In this paper, we develop a new approach, Adaptive Relevance Features Discovery (ARFD). It automatically updates the system's knowledge based on a sliding window over positive and negative feedback to solve a nonmonotonic problem efficiently. Some of the new training documents are selected using the knowledge that the system has currently obtained. Then, specific features are extracted from the selected training documents. Different methods are used to merge and revise the weights of features in a vector space. The new model is designed for Relevance Features Discovery (RFD), a pattern mining based approach, which uses negative relevance feedback to improve the quality of features extracted from positive feedback. Learning algorithms are also proposed to implement this approach on Reuters Corpus Volume 1 and TREC topics. Experiments show that the proposed approach works efficiently and achieves encouraging performance.
Abstract:
Most Australian states have introduced legislation to provide for enduring documents for financial, personal and health care decision making in the event of incapacity. Since the introduction of Enduring Powers of Attorney (EPAs) and Advance Health Directives (AHDs) in Queensland in 1998, concerns have continued to be raised by service providers, professionals and individuals about the uptake, understanding and appropriate use of these documents. In response to these concerns, the Department of Justice and Attorney-General (DJAG) convened a Practical Guardianship Initiatives Working Party. This group identified the limited evidence base available to address these concerns. In 2009, a multidisciplinary research team from the University of Queensland and the Queensland University of Technology was awarded $90,000 from the Legal Practitioners Interest on Trust Account Fund to undertake a review of the current EPA and AHD forms. The goal of the research was to gather data on the content and useability of the forms from the perspectives of a range of stakeholders, particularly those completing the EPA and AHD, witnesses of these documents, attorneys appointed under an EPA, and health professionals involved in the completion of an AHD or dealing with it in a clinical context. The researchers also sought to gather information from the perspective of Aboriginal and Torres Strait Islander (ATSI) individuals as well as people from culturally and linguistically diverse (CALD) groups. Although the focus of the research was on the forms and the extent to which the current design, content and format represent a barrier to uptake, in the course of the research some broader issues were identified which have an impact on the effectiveness of the EPA and AHD in achieving the goals of planning for financial, personal and health care in advance of losing capacity.
The data gathered enabled the researchers to achieve the primary goal of the research: to make recommendations to improve the content and useability of the forms which hopefully will lead to an increased uptake and appropriate use of the forms. However, the researchers thought it was important not to ignore broader policy issues that were identified in the course of the research. These broader issues have been highlighted in this Report, and the researchers have responded to them in a variety of ways. For some issues, the researchers have suggested alterations that could be made to the forms to address the particular concerns. For other issues, the researchers have suggested that Government may need to take specific action such as educating the broader community with some attention to strategies that engage particular groups within communities. Other concerns raised can only be dealt with by legislative reform and, in some of these cases, the researchers have identified issues that Government may wish to consider further. We do note, however, that it is beyond the scope of this Report to recommend changes to the law. This three stage mixed methods project aimed to provide systematic evidence from a broad range of stakeholders in regard to: (i) which groups use and do not use these documents and why, (ii) the contribution of the length/complexity/format/language of the forms as barriers to their completion and/or effective use, and (iii) the issues raised by the current documents for witnesses and attorneys. Understanding and use of EPAs and AHDs were generally explored in separate but parallel processes. A purposive sampling strategy included users of the documents as principals and attorneys, and professionals, witnesses and service providers who assist others to execute or use the forms. The first component of this study built on existing knowledge using a Critical Reference Group and material provided by the DJAG Practical Guardianship Initiatives Working Party. 
This assisted in the development of the data collection tools for subsequent stages. The second component comprised semi-structured interviews and focus groups with a targeted sample of current users of the forms, potential users, witnesses and other professionals to provide in-depth information on critical issues. Outreach to Aboriginal and Torres Strait Islander Elders and individuals and workers with CALD groups ensured a broad sample of potential users of the two documents. Fifty individual interviews and three focus groups were completed. Most interviews and focus groups focused on perceptions of, and experiences with, either the EPA or the AHD form. In the interviews with Indigenous people and the CALD focus groups, however, respondents provided their perceptions and experiences of both documents. In general, these respondents had not used the forms and were responding to the documents made available in the interview or focus group. In total, seventy-seven individuals were involved in interviews or focus groups. The final component comprised on-line surveys for EPA principals, EPA attorneys, AHD principals, witnesses of EPAs and AHDs and medical practitioners with experience of AHDs as nominated and/or treating doctors. The surveys were developed from the initial component and the qualitative analysis of the interview and focus group data. A total of 116 surveys were returned from major cities and regional Queensland. The survey data was analysed descriptively for patterns and trends. It is important to note that the aim of the survey was to gain insight into issues and concerns relating to the documents and not to make generalisations to the broader population.
Abstract:
With the growing number of XML documents on the Web it becomes essential to effectively organise these XML documents in order to retrieve useful information from them. A possible solution is to apply clustering on the XML documents to discover knowledge that promotes effective data management, information retrieval and query processing. However, many issues arise in discovering knowledge from these types of semi-structured documents due to their heterogeneity and structural irregularity. Most of the existing research on clustering techniques focuses only on one feature of the XML documents, this being either their structure or their content, due to scalability and complexity problems. The knowledge gained in the form of clusters based on the structure or the content is not suitable for real-life datasets. It therefore becomes essential to include both the structure and content of XML documents in order to improve the accuracy and meaning of the clustering solution. However, the inclusion of both these kinds of information in the clustering process results in a huge overhead for the underlying clustering algorithm because of the high dimensionality of the data. The overall objective of this thesis is to address these issues by: (1) proposing methods to utilise frequent pattern mining techniques to reduce the dimension; (2) developing models to effectively combine the structure and content of XML documents; and (3) utilising the proposed models in clustering. This research first determines the structural similarity in the form of frequent subtrees and then uses these frequent subtrees to represent the constrained content of the XML documents in order to determine the content similarity. A clustering framework with two types of models, implicit and explicit, is developed. The implicit model uses a Vector Space Model (VSM) to combine the structure and the content information.
The explicit model uses a higher order model, namely a 3-order Tensor Space Model (TSM), to explicitly combine the structure and the content information. This thesis also proposes a novel incremental technique to decompose large-sized tensor models to utilise the decomposed solution for clustering the XML documents. The proposed framework and its components were extensively evaluated on several real-life datasets exhibiting extreme characteristics to understand the usefulness of the proposed framework in real-life situations. Additionally, this research evaluates the outcome of the clustering process on the collection selection problem in information retrieval on the Wikipedia dataset. The experimental results demonstrate that the proposed frequent pattern mining and clustering methods outperform the related state-of-the-art approaches. In particular, the proposed framework of utilising frequent structures for constraining the content shows an improvement in accuracy over content-only and structure-only clustering results. The scalability evaluation experiments conducted on large-scale datasets clearly show the strengths of the proposed methods over state-of-the-art methods. In particular, this thesis work contributes to effectively combining the structure and the content of XML documents for clustering, in order to improve the accuracy of the clustering solution. In addition, it also contributes by addressing the research gaps in frequent pattern mining to generate efficient and concise frequent subtrees with various node relationships that could be used in clustering.
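One way to picture the difference between a flat vector space and a 3-order document × structure × term representation is the toy sketch below. It is not the thesis's implementation: the input format (lists of `(subtree, term)` pairs), the function names and the `S:`/`T:` prefixes are all assumptions made for illustration, and no tensor decomposition is attempted.

```python
from collections import Counter

# Hypothetical toy input: each document is a list of (subtree, term) pairs,
# i.e. a term observed under a given frequent structural pattern.
docs = {
    "d1": [("book/title", "xml"), ("book/author", "smith")],
    "d2": [("book/title", "smith"), ("book/author", "xml")],
}

def build_tensor(docs):
    """Sparse 3-order tensor: (doc, subtree, term) -> count."""
    tensor = Counter()
    for doc_id, pairs in docs.items():
        for subtree, term in pairs:
            tensor[(doc_id, subtree, term)] += 1
    return tensor

def flatten_to_vsm(tensor):
    """Flat vector per document: structure and term counts side by side."""
    vsm = {}
    for (doc_id, subtree, term), count in tensor.items():
        row = vsm.setdefault(doc_id, Counter())
        row["S:" + subtree] += count  # structure feature
        row["T:" + term] += count     # content feature
    return vsm

tensor = build_tensor(docs)
vsm = flatten_to_vsm(tensor)
```

In this toy data the two documents have identical flat vectors, because the flat form no longer records which term occurred under which structure, while the tensor entries still tell them apart.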
Abstract:
In Bowenbrae Pty Ltd v Flying Fighters Maintenance and Restoration [2010] QDC 347 Reid DCJ made orders requiring the plaintiffs to make application under the Freedom of Information Act 1982 (Cth) (“the FOI Act”) for documents sought by the defendant.
Abstract:
Existing macro-level research on the new venture creation process recognises the entrepreneur as a central agent in the process yet generally avoids, at each stage of the process, an examination of the micro-level psychological behaviour of the individual entrepreneur. By integrating two theoretical approaches to entrepreneurship research, the psychology of the entrepreneur and the entrepreneurship process, this paper uses content analysis to examine the language used by new venture founders in documents directly linked to their capital raising activity. The study examined the language of 108 offer documents (information memoranda), divided between 54 new ventures that were successful in raising capital and 54 new ventures that either did not proceed further or were not successful in raising capital through the Australian Small Scale Offerings Board. Specifically, we were interested in examining the level of optimism evident in these narratives, given that entrepreneurs have previously been described in the literature as being excessively optimistic.
Abstract:
Many existing information retrieval models do not explicitly take into account information about word associations. Our approach makes use of first and second order relationships found in natural language, known as syntagmatic and paradigmatic associations, respectively. This is achieved by using a formal model of word meaning within the query expansion process. On ad hoc retrieval, our approach achieves statistically significant improvements in MAP (0.158) and P@20 (0.396) over our baseline model. The ERR@20 and nDCG@20 of our system were 0.249 and 0.192 respectively. Our results and discussion suggest that information about both syntagmatic and paradigmatic associations can assist with improving retrieval effectiveness on ad hoc retrieval.
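The two kinds of association can be illustrated with simple co-occurrence statistics. This is a minimal sketch only, not the formal model of word meaning the abstract refers to: syntagmatic association is approximated here by how often two words appear in the same sentence, and paradigmatic association by the cosine similarity of their co-occurrence contexts. The function names and the toy sentences are assumptions.

```python
import math
from collections import Counter, defaultdict

def cooccurrence(sentences):
    """Count, for each word, the other words sharing a sentence with it."""
    co = defaultdict(Counter)
    for sentence in sentences:
        words = set(sentence.split())
        for w in words:
            for v in words:
                if w != v:
                    co[w][v] += 1
    return co

def syntagmatic(co, a, b):
    """First order: how often a and b occur together."""
    return co[a][b]

def paradigmatic(co, a, b):
    """Second order: cosine similarity of the contexts of a and b."""
    keys = (set(co[a]) | set(co[b])) - {a, b}
    dot = sum(co[a][k] * co[b][k] for k in keys)
    norm_a = math.sqrt(sum(co[a][k] ** 2 for k in keys))
    norm_b = math.sqrt(sum(co[b][k] ** 2 for k in keys))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical toy corpus.
co = cooccurrence(["drink hot coffee", "drink hot tea", "coffee beans"])
hot_coffee = syntagmatic(co, "hot", "coffee")
coffee_tea_together = syntagmatic(co, "coffee", "tea")
coffee_tea_similar = paradigmatic(co, "coffee", "tea")
```

Here "coffee" and "tea" never occur together, yet they share most of their contexts, so a query expansion step using the second measure could suggest one for the other.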
Abstract:
In Hare v Mount Isa City Council [2009] QDC 39 McGill DCJ examined the scope of s 27(1) of the Personal Injuries Proceedings Act 2002 (Qld) and its interpretation by the Court of Appeal in Haug v Jupiters Ltd [2008] 1 Qd R 276. The judge expressed a number of concerns about the Act and the Regulation made under it that are worthy of consideration by the Legislature.
Abstract:
In John Kallinicos Accountants Pty Ltd v Dundrenan Pty Ltd [2009] QDC 141 Irwin DCJ considered the nature of a party’s obligation under r 222 of the Uniform Civil Procedure Rules 1999 (Qld) (UCPR) to produce documents referred to in the parties’ pleadings, particulars or affidavits. The decision examined whether the approach in Belela Pty Ltd v Menzies Excavation Pty Ltd [2005] 2 Qd R 230 in relation to disclosure of documents under UCPR r 214 also applied to production of documents under r 222.
Abstract:
In Deppro Pty Ltd v Hannah [2008] QSC 193 one of the matters considered by the court related to the requirement in r 243 of the Uniform Civil Procedure Rules 1999 (Qld) that a notice of non-party disclosure must “state the allegation in issue in the pleadings about which the document sought is directly relevant.” The approach adopted by the issuing party in this case, of asserting that documents sought by a notice of non-party disclosure are relevant to allegations in numbered paragraphs in pleadings and serving copies of the pleadings with the notice, is not uncommon in practice. This decision makes it clear that this practice is fraught with danger. Where it is not apparent that the non-party has been fully apprised of the relevant issues, the decision suggests that an applicant for non-party disclosure who has not complied with the requirements of r 243 might be required to issue a fresh, fully compliant notice, and to suffer associated costs consequences.
Abstract:
Enterprise Systems (ES) provide standardized, off-the-shelf support for operations and management within organizations. With the advent of ES based on a service-oriented architecture (SOA) and an increasing demand for IT-supported interorganizational collaboration, implementation projects face paradigmatically new challenges. The configuration of ES is costly and error-prone. Dependencies between business processes and business documents are hardly explicit and foster component proliferation instead of reuse. Configurative modeling can address the problem in two ways: First, conceptual modeling abstracts from technical details and provides more intuitive access and overview. Second, configuration allows the projection of variants from master models, providing manageable variants with controlled flexibility. We aim at tackling the problem by proposing an integrated model-based framework for configuring both processes and business documents on an equal basis, as together they constitute the core business components of an ES.
Abstract:
We propose a cluster ensemble method to map the corpus documents into the semantic space embedded in Wikipedia and group them using multiple types of feature space. A heterogeneous cluster ensemble is constructed with multiple types of relations, i.e. document-term, document-concept and document-category. A final clustering solution is obtained by exploiting associations between document pairs and hubness of the documents. Empirical analysis with various real data sets reveals that the proposed method outperforms state-of-the-art text clustering approaches.
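The idea of combining several clusterings through associations between document pairs can be sketched with a generic co-association (evidence accumulation) scheme. This is an assumed, simplified stand-in, not the method proposed in the abstract: the hubness component is omitted, and the function names, the threshold and the toy labelings are hypothetical.

```python
def co_association(labelings):
    """Fraction of base clusterings that place each document pair together."""
    n = len(labelings[0])
    m = len(labelings)
    co = [[0.0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    co[i][j] += 1.0 / m
    return co

def consensus_clusters(labelings, threshold=0.5):
    """Link pairs whose co-association exceeds the threshold (union-find)."""
    co = co_association(labelings)
    n = len(co)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if co[i][j] > threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical labelings of four documents from three feature spaces.
by_term = [0, 0, 1, 1]
by_concept = [0, 0, 0, 1]
by_category = [1, 1, 0, 0]
clusters = consensus_clusters([by_term, by_concept, by_category])
```

Documents 2 and 3 are grouped because two of the three feature spaces agree on them, even though the concept-based clustering separates them.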
Abstract:
This article considers from an Australian perspective the impediments that copyright law places in the path of those who seek to use patent specifications and non-patent prior art documents in ways that are necessary to the proper functioning of the patent system. Until recently, copyright law in Australia had limited the uses to which members of the public could put patent specifications in that country. Those impediments have been removed as a result of an important legislative change to the way in which copyright in patent specifications can be enforced. The change gives the public a greater freedom to make use of patent specifications than it enjoyed before, and removes unwarranted restrictions upon the ways in which the public can reuse valuable information. However, what the amendment does not address is the impediments copyright imposes on using non-patent prior art documents in ways that advance the public interest.
Abstract:
One of the recent Raising the Bar amendments has removed impediments imposed by copyright law that may have limited the uses to which IP Australia and members of the public could have lawfully put patent specifications without seeking permission from the copyright owner. What the amendment does not do, however, is extend the same protections to those who wish to use prior art documents in ways that benefit the patent system and further the public interest.
Abstract:
A known limitation of the Probability Ranking Principle (PRP) is that it does not cater for dependence between documents. Recently, the Quantum Probability Ranking Principle (QPRP) has been proposed, which implicitly captures dependencies between documents through “quantum interference”. This paper explores whether this new ranking principle leads to improved performance for subtopic retrieval, where novelty and diversity are required. In a thorough empirical investigation, models based on the PRP, as well as other recently proposed ranking strategies for subtopic retrieval (i.e. Maximal Marginal Relevance (MMR) and Portfolio Theory (PT)), are compared against the QPRP. On the given task, it is shown that the QPRP outperforms these other ranking strategies. Unlike MMR and PT, the QPRP requires no parameter estimation or tuning, which is one of its main advantages and makes it both simple and effective. This research demonstrates that the application of quantum theory to problems within information retrieval can lead to significant improvements.
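For context, the MMR baseline mentioned above can be sketched as a greedy re-ranking that trades relevance to the query against similarity to documents already selected. This is a minimal illustration of the standard MMR criterion, not the paper's experimental setup: the vector representation, cosine similarity and the toy data are assumptions. The `lam` parameter is the kind of tuning knob the abstract says the QPRP avoids.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def mmr_rank(query_vec, doc_vecs, lam=0.5):
    """Greedy MMR: lam * relevance - (1 - lam) * max similarity to selected."""
    remaining = list(range(len(doc_vecs)))
    selected = []
    while remaining:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical toy data: documents 0 and 1 are duplicates, document 2 differs.
query = [1.0, 1.0]
docs = [[1.0, 0.9], [1.0, 0.9], [1.0, 0.0]]
by_relevance = mmr_rank(query, docs, lam=1.0)
diversified = mmr_rank(query, docs, lam=0.3)
```

With `lam=1.0` the ranking is by relevance alone and the duplicate is ranked second; with `lam=0.3` the less relevant but novel document 2 is promoted ahead of it.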