967 results for Cartes-Col·lections


Relevance:

10.00% 10.00%

Publisher:

Abstract:

Traditional information retrieval (IR) systems respond to user queries with ranked lists of relevant documents. The separation of content and structure in XML documents allows individual XML elements to be selected in isolation. Thus, users expect XML-IR systems to return highly relevant results that are more precise than entire documents. In this paper we describe the implementation of a search engine for XML document collections. The system is keyword based and is built upon an XML inverted file system. We describe the approach that was adopted to meet the requirements of Content Only (CO) and Vague Content and Structure (VCAS) queries in INEX 2004.
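The idea of an XML inverted file that returns elements rather than whole documents can be illustrated with a minimal sketch. This is not the system described in the abstract: the function names (`build_index`, `search`), the path notation, and the boolean-AND matching are all assumptions made for illustration, and a real XML-IR engine would also rank its results.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def _terms(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"\w+", (text or "").lower())


def _walk(elem, path, doc_id, index):
    # Index the text that sits directly inside this element.
    for term in _terms(elem.text):
        index[term].add((doc_id, path))
    # Number same-tag siblings so that every element path is unique.
    counts = defaultdict(int)
    for child in elem:
        counts[child.tag] += 1
        _walk(child, f"{path}/{child.tag}[{counts[child.tag]}]", doc_id, index)
        # Text following a child still belongs to the current element.
        for term in _terms(child.tail):
            index[term].add((doc_id, path))


def build_index(docs):
    """Map each term to the set of (doc_id, element_path) pairs containing it."""
    index = defaultdict(set)
    for doc_id, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        _walk(root, "/" + root.tag, doc_id, index)
    return index


def search(index, query):
    """Return the elements that contain every query term (hypothetical AND search)."""
    terms = _terms(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(t, set()) for t in terms))
```

Because postings point at element paths, a keyword query can return a single paragraph or title instead of the entire article, which is the behaviour the abstract says users expect from XML-IR systems.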

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The advantages of bundling e-journals together into publisher collections include increased access to information for the subscribing institution’s clients, purchasing cost-effectiveness and streamlined workflows. Whilst cataloguing a consortial e-journal collection has its advantages, there are also various pitfalls and the author outlines efforts by the CAUL (Council of Australian University Libraries) Consortium libraries to further streamline this process, working in conjunction with major publishers. Despite the advantages that publisher collections provide, pressures to unbundle existing packages continue to build, fuelled by an ever-increasing selection of available electronic resources; decreases in, and competing demands upon, library budgets; the impact of currency fluctuations; and poor usage for an alarmingly high proportion of collection titles. Consortial perspectives on bundling and unbundling titles are discussed, including options for managing the addition of new titles to the bundle and why customising consortial collections currently does not work. Unbundling analyses carried out at Queensland University of Technology during 2006 to 2008 prior to the renewal of several major publisher collections are presented as further case studies which illustrate why the “big deal” continues to persist.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The picturesque aesthetic in the work of Sir John Soane, architect and collector, resonates in the major work of his very personal practice – the development of his house museum, now the Soane Museum in Lincoln’s Inn Fields in London. Soane was actively involved with the debates, practices and proponents of picturesque and classical practices in architecture and landscape and his lectures reveal these influences in the making of The Soane, which was built to contain and present diverse collections of classical and contemporary art and architecture alongside scavenged curiosities. The Soane Museum has been described as a picturesque landscape, where a pictorial style, together with a carefully defined itinerary, has resulted in the ‘apotheosis of the Picturesque interior’. Soane also experimented with making mock ruinscapes within gardens, which led him to construct faux architectures alluding to archaeological practices based upon the ruin and the fragment. These ideas framed the making of interior landscapes expressed through spatial juxtapositions of room and corridor furnished with the collected object that characterise The Soane Museum. This paper is a personal journey through the Museum which describes and then reviews aspects of Soane’s work in the context of contemporary theories on ‘new’ museology. It describes the underpinning picturesque practices that Soane employed to exceed the boundaries between interior and exterior landscapes and the collection. It then applies particular picturesque principles drawn from visiting The Soane to a speculative project for a house/landscape museum for the Oratunga historic property in outback South Australia, where the often normalising effects of conservation practices are reviewed using minimal architectural intervention through a celebration of ruinous states.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This paper explores how people communicate in reference to local interests and suggests information and communication technology (ICT) design for enhancement of local community networks. Qualitative data was gathered from participant observations of local community collective action and open interviews with active community members. Data analysis revealed concepts, leading to categories in relation to local interactions and interests. Design suggestions consider introducing people to local community private-strategic activity via public displays that indicate simple entry points to active participation, and creating information collections according to local community perspectives for long-term reference.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Search engines have forever changed the way people access and discover knowledge, allowing information about almost any subject to be quickly and easily retrieved within seconds. As increasingly more material becomes available electronically the influence of search engines on our lives will continue to grow. This presents the problem of how to find what information is contained in each search engine, what bias a search engine may have, and how to select the best search engine for a particular information need. This research introduces a new method, search engine content analysis, in order to solve the above problem. Search engine content analysis is a new development of the traditional information retrieval field called collection selection, which deals with general information repositories. Current research in collection selection relies on full access to the collection or estimations of the size of the collections. Also, collection descriptions are often represented as term occurrence statistics. An automatic ontology learning method is developed for the search engine content analysis, which trains an ontology with world knowledge of hundreds of different subjects in a multilevel taxonomy. This ontology is then mined to find important classification rules, and these rules are used to perform an extensive analysis of the content of the largest general purpose Internet search engines in use today. Instead of representing collections as a set of terms, which commonly occurs in collection selection, they are represented as a set of subjects, leading to a more robust representation of information and a decrease of synonymy. The ontology based method was compared with ReDDE (Relevant Document Distribution Estimation method for resource selection) using the standard R-value metric, with encouraging results. ReDDE is the current state-of-the-art collection selection method, which relies on collection size estimation.
The method was also used to analyse the content of the most popular search engines in use today, including Google and Yahoo. In addition, several specialist search engines such as PubMed and the U.S. Department of Agriculture were analysed. In conclusion, this research shows that the ontology based method mitigates the need for collection size estimation.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Peer to peer systems are widely used on the internet. However, most peer to peer information systems still lack some important features, for example cross-language IR (Information Retrieval) and collection selection / fusion features. Cross-language IR is a state-of-the-art research area in the IR research community. It has not yet been used in any real-world IR system. Cross-language IR makes it possible to issue a query in one language and receive documents in other languages. In a typical peer to peer environment, users are from multiple countries, and their collections are in multiple languages. Cross-language IR can help users to find documents more easily. For example, many Chinese researchers search for research papers in both Chinese and English. With cross-language IR, they can issue one query in Chinese and get documents in two languages. The Out Of Vocabulary (OOV) problem is one of the key research areas in cross-language information retrieval. In recent years, web mining has been shown to be one of the effective approaches to solving this problem. However, how to extract Multiword Lexical Units (MLUs) from web content and how to select the correct translations from the extracted candidate MLUs are still two difficult problems in web mining based automated translation approaches. Discovering resource descriptions and merging results obtained from remote search engines are two key issues in distributed information retrieval studies. In uncooperative environments, query-based sampling and normalized-score based merging strategies are well-known approaches to solving such problems. However, such approaches only consider the content of the remote database and do not consider the retrieval performance of the remote search engine. This thesis presents research on building a peer to peer IR system with cross-language IR and advanced collection profiling techniques for fusion features.
In particular, this thesis first presents a new Chinese term measurement and a new Chinese MLU extraction process that works well on small corpora. An approach to selecting MLUs more accurately is also presented. After that, this thesis proposes a collection profiling strategy which can discover not only collection content but also the retrieval performance of the remote search engine. Based on collection profiling, a web-based query classification method and two collection fusion approaches are developed and presented in this thesis. Our experiments show that the proposed strategies are effective in merging results in uncooperative peer to peer environments. Here, an uncooperative environment is defined as one in which each peer in the system is autonomous: peers are willing to share documents but they do not share collection statistics. This environment is a typical peer to peer IR environment. Finally, all those approaches are grouped together to build a secure peer to peer multilingual IR system that cooperates through X.509 and an email system.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm. Unlike k-means it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare performance and quality to CLUTO using document collections. The K-tree has a low time complexity that is suitable for large document collections. This tree structure allows for efficient disk based implementations where space requirements exceed that of main memory.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Sugarcane orange rust, caused by Puccinia kuehnii, was once considered a minor disease in the Australian sugar industry. However, in 2000 a new race of the pathogen devastated the high-performing sugarcane cultivar Q124, and caused the industry Aus$150–210 million in yield losses. At the time of the epidemic, very little was known about the genetic and pathogenic diversity of the fungus in Australia and neighbouring sugar industries. DNA sequence data from three rDNA regions were used to determine the genetic relationships between isolates within two P. kuehnii collections. The first collection comprised only recent Australian field isolates and limited sequence variation was detected within this population. In the second study, Australian isolates were compared with isolates from Papua New Guinea, Indonesia, China and historical herbarium collections. Greater sequence variation was detected in this collection and phylogenetic analyses grouped the isolates into three clades. All isolates from commercial cane fields clustered together including the recent Australian field isolates and the Australian historical isolate from 1898. The other two clades included rust isolates from wild and garden canes in Indonesia and PNG. These rusts appeared morphologically similar to P. kuehnii and could potentially pose a quarantine threat to the Australian sugar industry. The results have revealed greater diversity in sugarcane rusts than previously thought.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Aim: To measure latitude-related body size variation in field-collected Paropsis atomaria Olivier (Coleoptera: Chrysomelidae) individuals and to conduct common-garden experiments to determine whether such variation is due to phenotypic plasticity or local adaptation. Location: Four collection sites from the east coast of Australia were selected for our present field collections: Canberra (latitude 35°19' S), Bangalow (latitude 28°43' S), Beerburrum (latitude 26°58' S) and Lowmead (latitude 24°29' S). Museum specimens collected over the past 100 years and covering the same geographical area as the present field collections came from one state, one national and one private collection. Methods: Body size (pronotum width) was measured for 118 field-collected beetles and 302 specimens from collections. We then reared larvae from the latitudinal extremes (Canberra and Lowmead) to determine whether the size cline was the result of phenotypic plasticity or evolved differences (= local adaptation) between sites. Results: Beetles decreased in size with increasing latitude, representing a converse Bergmann cline. A decrease in developmental temperature produced larger adults for both Lowmead (low latitude) and Canberra (high latitude) individuals, and those from Lowmead were larger than those from Canberra when reared under identical conditions. Main conclusions: The converse Bergmann cline in P. atomaria is likely to be the result of local adaptation to season length.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

What happens when patterns become all pervasive? When pattern contagiously corrupts and saturates adjacent objects, artefacts and surfaces; blurring internal and external environment and dissolving any single point of perspective or static conception of space. Mark Taylor ruminates on the possibilities of relentless patterning in interior space in both a historic and a contemporary context.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

It has long been hypothesised that negative feedback should substantially improve the performance of information filtering systems; however, no very effective models have yet been obtained to support this hypothesis. This paper proposes an effective model that uses negative relevance feedback based on a pattern mining approach to improve extracted features. This study focuses on two main issues of using negative relevance feedback: the selection of constructive negative examples to reduce the space of negative examples; and the revision of existing features based on the selected negative examples. The former selects some offender documents, where offender documents are negative documents that are most likely to be classified in the positive group. The latter groups the extracted features into three groups: the positive specific category, general category and negative specific category to easily update the weight. An iterative algorithm is also proposed to implement this approach on RCV1 data collections, and substantial experiments show that the proposed approach achieves encouraging performance.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Cultural objects are increasingly generated and stored in digital form, yet effective methods for their indexing and retrieval still remain an important area of research. The main problem arises from the disconnection between the content-based indexing approach used by computer scientists and the description-based approach used by information scientists. There is also a lack of representational schemes that allow the alignment of the semantics and context with keywords and low-level features that can be automatically extracted from the content of these cultural objects. This paper presents an integrated approach to address these problems, taking advantage of both computer science and information science approaches. We first discuss the requirements from a number of perspectives: users, content providers, content managers and technical systems. We then present an overview of our system architecture and describe various techniques which underlie the major components of the system. These include: automatic object category detection; user-driven tagging; metadata transform and augmentation; and an expression language for digital cultural objects. In addition, we discuss our experience in testing and evaluating some existing collections, analyse the difficulties encountered and propose ways to address these problems.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Opiine wasps (Hymenoptera: Braconidae: Opiinae) are parasitoids of dacine fruit flies (Diptera: Tephritidae: Dacinae), the primary horticultural pests of Australia and the South Pacific. Effective use of opiines for biological control of fruit flies is limited by poor taxonomy and identification difficulties. To overcome these problems, this thesis had two aims: (i) to carry out traditional taxonomic research on the fruit fly infesting opiine braconids of Australia and the South Pacific; and (ii) to transfer the results of the taxonomic research into user-friendly diagnostic tools. Curated wasp material was borrowed from all major Australian museum collections holding specimens. This was supplemented by a large body of material gathered as part of a major fruit fly project in Papua New Guinea: nearly 4000 specimens were examined and identified. Each wasp species was illustrated using traditional scientific drawings, full colour photomicroscopy and scanning electron microscopy. An electronic identification key was developed using Lucid software and diagnostic images were loaded on the web-based Pest and Diseases Image Library (PaDIL). A taxonomic synopsis and distribution and host records for each of the 15 species of dacine-parasitising opiine braconids found in the South Pacific are presented. Biosteres illusorius Fischer (1971) was formally transferred to the genus Fopius and a new species, Fopius ferrari Carmichael and Wharton (2005), was described. Other species dealt with were Diachasmimorpha hageni (Fullaway, 1952), D. kraussii (Fullaway, 1951), D. longicaudata (Ashmead, 1905), D. tryoni (Cameron, 1911), Fopius arisanus (Sonan, 1932), F. deeralensis (Fullaway, 1950), F. schlingeri Wharton (1999), Opius froggatti Fullaway (195), Psyttalia fijiensis (Fullaway, 1936), P. muesebecki (Fischer, 1963), P. novaguineensis (Szépligeti, 1900) and Utetes perkinsi (Fullaway, 1950).
This taxonomic component of the thesis has been formally published in the scientific literature. An interactive diagnostics package (“OpiineID”) was developed, the centre of which is a Lucid based multi-access key. Because the diagnostics package is computer based, without the space limitations of the journal publication, there is no pictorial limit in OpiineID and so it is comprehensively illustrated with SEM photographs, full colour photographs, line drawings and fully rendered illustrations. The identification key is only one small component of OpiineID and the key is supported by fact sheets with morphological descriptions, host associations, geographical information and images. Each species contained within the OpiineID package has also been uploaded onto the PaDIL website (www.padil.gov.au). Because the identification of fruit fly parasitoids is largely of concern to fruit fly workers, rather than braconid specialists, this thesis deals directly with an area of growing importance to many areas of pure and applied biology: the nexus between taxonomy and diagnostics. The Discussion chapter focuses on this area, particularly the opportunities offered by new communication and information tools as new ways of delivering the outputs of taxonomic science.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The increasing diversity of the Internet has created a vast number of multilingual resources on the Web. A huge number of these documents are written in various languages other than English. Consequently, the demand for searching in non-English languages is growing exponentially. It is desirable that a search engine can search for information over collections of documents in other languages. This research investigates the techniques for developing high-quality Chinese information retrieval systems. A distinctive feature of Chinese text is that a Chinese document is a sequence of Chinese characters with no space or boundary between Chinese words. This feature makes Chinese information retrieval more difficult, since a retrieved document which contains the query term as a sequence of Chinese characters may not really be relevant to the query: the query term (as a sequence of Chinese characters) may not be a valid Chinese word in that document. On the other hand, a document that is actually relevant may not be retrieved because it does not contain the query sequence but contains other relevant words. In this research, we propose two approaches to deal with these problems. In the first approach, we propose a hybrid Chinese information retrieval model by incorporating word-based techniques with the traditional character-based techniques. The aim of this approach is to investigate the influence of Chinese segmentation on the performance of Chinese information retrieval. Two ranking methods are proposed to rank retrieved documents based on the relevancy to the query calculated by combining character-based ranking and word-based ranking. Our experimental results show that Chinese segmentation can improve the performance of Chinese information retrieval, but the improvement is not significant if it incorporates only Chinese segmentation with the traditional character-based approach.
In the second approach, we propose a novel query expansion method which applies text mining techniques in order to find the most relevant words to extend the query. Unlike most existing query expansion methods, which generally select the highly frequent indexing terms from the retrieved documents to expand the query, our approach utilizes text mining techniques to find patterns from the retrieved documents that highly correlate with the query term and then uses the relevant words in the patterns to expand the original query. This research project develops and implements a Chinese information retrieval system for evaluating the proposed approaches. There are two stages in the experiments. The first stage is to investigate whether high accuracy segmentation can improve Chinese information retrieval. In the second stage, a text mining based query expansion approach is implemented and a further experiment is carried out to compare the performance of the standard Rocchio approach with that of the proposed text mining based query expansion method. The NTCIR5 Chinese collections are used in the experiments. The experimental results show that by incorporating the text mining based query expansion with the hybrid model, significant improvement has been achieved in both precision and recall assessments.
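For readers unfamiliar with the Rocchio baseline named in this abstract, the following is a minimal sketch of the textbook Rocchio update, which moves the query vector toward the centroid of relevant documents and away from the centroid of non-relevant ones. The function name, the sparse-dictionary representation, the default weights, and the dropping of non-positive term weights are assumptions made for illustration; the abstract does not state which settings the thesis used.

```python
from collections import defaultdict


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Textbook Rocchio update on sparse term-weight dictionaries.

    new_q = alpha * q + beta * mean(relevant) - gamma * mean(non_relevant)
    The default weights are illustrative, not taken from the abstract.
    """
    new_q = defaultdict(float)
    for term, weight in query.items():
        new_q[term] += alpha * weight
    for docs, coeff in ((relevant, beta), (non_relevant, -gamma)):
        if not docs:
            continue  # an empty feedback set contributes nothing
        for doc in docs:
            for term, weight in doc.items():
                new_q[term] += coeff * weight / len(docs)
    # Assumption: terms whose weight is not positive are dropped.
    return {t: w for t, w in new_q.items() if w > 0}
```

Terms that appear only in the relevant documents enter the new query with positive weight, which is the query expansion effect; this baseline picks terms by their weight in the feedback documents rather than by mined patterns.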

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Information Retrieval is an important albeit imperfect component of information technologies. A problem of insufficient diversity of retrieved documents is one of the primary issues studied in this research. This study shows that this problem leads to a decrease of precision and recall, traditional measures of information retrieval effectiveness. This thesis presents an adaptive IR system based on the theory of adaptive dual control. The aim of the approach is the optimization of retrieval precision after all feedback has been issued. This is done by increasing the diversity of retrieved documents. This study shows that the value of recall reflects this diversity. The Probability Ranking Principle is viewed in the literature as the “bedrock” of current probabilistic Information Retrieval theory. Neither the proposed approach nor other methods of diversification of retrieved documents from the literature conform to this principle. This study shows by counterexample that the Probability Ranking Principle does not in general lead to optimal precision in a search session with feedback (for which it may not have been designed but is actively used). Retrieval precision of the search session should be optimized with a multistage stochastic programming model to accomplish the aim. However, such models are computationally intractable. Therefore, approximate linear multistage stochastic programming models are derived in this study, where the multistage improvement of the probability distribution is modelled using the proposed feedback correctness method. The proposed optimization models are based on several assumptions, starting with the assumption that Information Retrieval is conducted in units of topics. The use of clusters is the primary reason why a new method of probability estimation is proposed. The adaptive dual control of topic-based IR system was evaluated in a series of experiments conducted on the Reuters, Wikipedia and TREC collections of documents.
The Wikipedia experiment revealed that the dual control feedback mechanism improves precision and S-recall when all the underlying assumptions are satisfied. In the TREC experiment, this feedback mechanism was compared to a state-of-the-art adaptive IR system based on BM-25 term weighting and the Rocchio relevance feedback algorithm. The baseline system exhibited better effectiveness than the cluster-based optimization model of ADTIR. The main reason for this was insufficient quality of the generated clusters in the TREC collection that violated the underlying assumption.
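The BM-25 term weighting used by the baseline system can be sketched as follows. This is a common formulation of Okapi BM25, not necessarily the variant used in the thesis: the function name, the tokenised-list input, the parameter defaults `k1=1.2` and `b=0.75`, and the non-negative IDF form are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenised document against the query with a common BM25 variant."""
    if not docs:
        return []
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    if avgdl == 0:
        return [0.0] * n_docs  # every document is empty
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            # Assumption: the "+1 inside the log" IDF form, which is never negative.
            idf = math.log((n_docs - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalisation.
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

A document that repeats a query term scores higher than one that mentions it once, but the gain saturates as the term frequency grows, and longer documents are penalised relative to the average length.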