951 results for Digitized Collections
Abstract:
Increasingly, scientists are using collections of software tools in their research. These tools are typically used in concert, often necessitating laborious and error-prone manual data reformatting and transfer. We present an intuitive workflow environment, GPFlow, to support scientists in their research. GPFlow wraps legacy tools and presents a high-level, interactive, web-based front end to scientists. The workflow backend is realized by a commercial-grade workflow engine (Windows Workflow Foundation). The workflow model is inspired by spreadsheets and is novel in its support for an intuitive, interactive style of work that enables the kind of experimentation required by many scientists, e.g. bioinformaticians. We apply GPFlow to two bioinformatics experiments and demonstrate its flexibility and simplicity.
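To make the tool-wrapping idea above concrete, here is a minimal sketch, in Python rather than GPFlow's actual Windows Workflow Foundation backend, of how a legacy command-line tool might be exposed as a workflow step with declared inputs and outputs; the step names, tool invocation and file handling are illustrative assumptions, not GPFlow's API.

```python
# Minimal sketch (hypothetical, not GPFlow's actual API): wrapping a legacy
# command-line tool as a reusable workflow step with declared inputs/outputs.
import subprocess
import tempfile
from pathlib import Path

def blast_step(query_fasta: str, database: str) -> str:
    """Run a legacy alignment tool and return the path to its output file.

    The tool invocation here is illustrative; a real wrapper would also
    validate inputs and record provenance for the workflow engine.
    """
    out_path = Path(tempfile.mkdtemp()) / "hits.tsv"
    subprocess.run(
        ["blastn", "-query", query_fasta, "-db", database,
         "-outfmt", "6", "-out", str(out_path)],
        check=True,
    )
    return str(out_path)

# A downstream step consumes the previous step's output, so the workflow
# engine can chain them without manual reformatting or file transfer.
def count_hits_step(hits_tsv: str) -> int:
    with open(hits_tsv) as f:
        return sum(1 for line in f if line.strip())
```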
Abstract:
The Wikipedia has become the most popular online source of encyclopedic information. The English Wikipedia collection, as well as some other language collections, is extensively linked. However, as a multilingual collection the Wikipedia is only very weakly linked. There are few cross-language links or cross-dialect links (see, for example, Chinese dialects). In order to link the multilingual Wikipedia as a single collection, automated cross-language link discovery systems are needed – systems that identify anchor-texts in one language and targets in another. The evaluation of link discovery approaches within the English version of the Wikipedia has been examined in the INEX Link-the-Wiki track since 2007, whilst both CLEF and NTCIR have emphasized the investigation and evaluation of cross-language information retrieval. In this position paper we propose a new virtual evaluation track: Cross Language Link Discovery (CLLD). The track will initially examine cross-language linking of Wikipedia articles. This virtual track will not be tied to any one forum; instead we hope it can be connected to each of (at least) CLEF, NTCIR, and INEX, as it will cover ground currently studied by each. The aim is to establish a virtual evaluation environment supporting continuous assessment and evaluation, and a forum for the exchange of research ideas. It will be free from the difficulties of scheduling and synchronizing groups of collaborating researchers and will alleviate the need to travel across the globe in order to share knowledge. We aim to electronically publish peer-reviewed publications arising from CLLD in a similar fashion: online, with open access, and without fixed submission deadlines.
Abstract:
Language Modeling (LM) has been successfully applied to Information Retrieval (IR). However, most existing LM approaches rely only on term occurrences in documents, queries and document collections. In traditional unigram-based models, terms (or words) are usually considered to be independent. In some recent studies, dependence models have been proposed to incorporate term relationships into LM, so that links can be created between words in the same sentence, and term relationships (e.g. synonymy) can be used to expand the document model. In this study, we further extend this family of dependence models in the following two ways: (1) term relationships are used to expand the query model instead of the document model, so that the query expansion process can be naturally implemented; (2) we exploit more sophisticated inferential relationships extracted with Information Flow (IF). Information flow relationships are not simply pairwise term relationships like those used in previous studies, but relationships between a set of terms and another term. They allow for context-dependent query expansion. Our experiments conducted on TREC collections show that we can obtain large and significant improvements with our approach. This study shows that LM is an appropriate framework in which to implement effective query expansion.
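As a rough illustration of expanding the query model rather than the document model, the sketch below mixes a uniform query model with expansion terms weighted by term relationships; the relationship table and the mixing parameter are placeholders, and the snippet does not reproduce the paper's information flow computation.

```python
# Illustrative sketch of query-model expansion in the LM framework
# (the relationship scores below are placeholders, not the paper's IF method).
from collections import Counter

def expand_query_model(query_terms, related, lam=0.7):
    """Mix the original (uniform) query model with expansion terms.

    related: dict mapping a term to {expansion_term: strength} pairs,
    standing in for inferential (information-flow) relationships.
    """
    base = {t: 1.0 / len(query_terms) for t in query_terms}
    expansion = Counter()
    for t in query_terms:
        for e, w in related.get(t, {}).items():
            expansion[e] += w
    total = sum(expansion.values()) or 1.0
    expanded = {t: lam * p for t, p in base.items()}
    for e, w in expansion.items():
        expanded[e] = expanded.get(e, 0.0) + (1 - lam) * w / total
    return expanded  # an expanded, smoothed query language model P'(w|Q)

model = expand_query_model(
    ["space", "program"],
    {"space": {"nasa": 0.6, "shuttle": 0.4}},
)
```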
Abstract:
Information has no value unless it is accessible. Information must be connected together so that a knowledge network can be built; such a knowledge base is a key resource for Internet users to interlink information from documents. Information retrieval, a key technology for knowledge management, guarantees access to large corpora of unstructured text. Collaborative knowledge management systems such as Wikipedia are more popular than ever; however, their link creation function is not optimized for discovering possible links in the collection, and the quality of automatically generated links has never been quantified. This research begins with an evaluation forum intended to support collaborative experiments in focused link discovery as well as the investigation of link discovery applications. The research focus was on the evaluation strategy: the proposed evaluation framework, covering rules, formats, pooling, validation, assessment and evaluation, has proved efficient to run and reusable for further extension. A collection-split approach is used to reconstruct the Wikipedia collection into a split collection comprising single-passage files. This split collection is shown to be feasible for improving the discovery of relevant passages and is intended to serve as a corpus for focused link discovery. Following these experiments, a mobile client-side prototype built on the iPhone is developed to address mobile search by using focused link discovery technology. According to the interview survey, the proposed mobile interactive UI does improve the experience of mobile information seeking. Based on this evaluation framework, a novel cross-language link discovery proposal using multiple text collections is developed. A dynamic evaluation approach is proposed to enhance both the collaborative effort and the interaction between submission and evaluation. A realistic evaluation scheme has been implemented at NTCIR for cross-language link discovery tasks.
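The collection-split step described above can be pictured with a short sketch that writes each passage of a document to its own file; the paragraph-based segmentation rule used here is an assumption, since the abstract does not specify how passages are delimited.

```python
# Rough sketch of a collection-split step: writing each passage of a document
# to its own file so passages can be retrieved and linked individually.
# The paragraph-based split below is an assumption, not the thesis's actual rule.
from pathlib import Path

def split_document(doc_id: str, text: str, out_dir: Path) -> list[Path]:
    out_dir.mkdir(parents=True, exist_ok=True)
    passages = [p.strip() for p in text.split("\n\n") if p.strip()]
    paths = []
    for i, passage in enumerate(passages):
        path = out_dir / f"{doc_id}_{i:04d}.txt"
        path.write_text(passage, encoding="utf-8")
        paths.append(path)
    return paths
```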
Abstract:
We present a novel, web-accessible scientific workflow system which makes large-scale comparative studies accessible without programming or excessive configuration requirements. GPFlow allows a workflow defined on single input values to be automatically lifted to operate over collections of input values and supports the formation and processing of collections of values without the need for explicit iteration constructs. We introduce a new model for collection processing based on key aggregation and slicing which guarantees processing integrity and facilitates automatic association of inputs, allowing scientific users to manage the combinatorial explosion of data values inherent in large scale comparative studies. The approach is demonstrated using a core task from comparative genomics, and builds upon our previous work in supporting combined interactive and batch operation, through a lightweight web-based user interface.
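The key-based lifting and association described above might look roughly like the sketch below, which applies a single-value step to every keyed input and then joins keyed collections on their shared keys; the function names and the toy genomics inputs are illustrative assumptions, not GPFlow's actual interface.

```python
# Sketch of lifting a single-value operation over a keyed collection and
# re-associating results by key, in the spirit of the aggregation/slicing
# model described above (names and structure are illustrative).
def lift(step, keyed_inputs):
    """Apply a single-value step to every (key, value) pair, keeping keys."""
    return {key: step(value) for key, value in keyed_inputs.items()}

def associate(*keyed_collections):
    """Join several keyed collections on their shared keys, so values that
    belong to the same experimental unit stay together."""
    shared = set.intersection(*(set(c) for c in keyed_collections))
    return {k: tuple(c[k] for c in keyed_collections) for k in shared}

genomes = {"ecoli": "genomes/ecoli.fa", "styphi": "genomes/styphi.fa"}
lengths = lift(len, genomes)          # stand-in for a real analysis step
paired = associate(genomes, lengths)  # {'ecoli': ('genomes/ecoli.fa', 16), ...}
```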
Abstract:
Australia’s mass market fashion labels have traditionally benefited from their location on the periphery of the world’s fashion centres. Operating a season behind, Australian mass market designers and buyers were well-placed to watch trends play out overseas before testing them in the Australian marketplace. For this reason, a designer’s role was often to source and oversee the manufacture of ‘knock-offs’, or close copies of Northern hemisphere mass market garments. Both Weller (2007) and Walsh (2009) have commented on this practice. The knock-on effect from this continues to be a cautious, derivative fashion sensibility within Australian mass market fashion design, where any new trend or product is first tested and proved overseas months earlier. However, there is evidence that this is changing. The rapid online dissemination of global fashion trends, coupled with the Australian consumer’s willingness to shop online, has meant that the ‘knock-off’ is less viable. For this reason, a number of mass market companies are moving away from the practice of direct sourcing and are developing product in-house under a Northern hemisphere model. This shift is also witnessed in the trend for mass market companies to develop collections in partnership with independent Australian designers. This paper explores the current and potential effects of these shifts within Australian mass market design practice, and discusses how they may impact on designers, consumers and on the wider culture of Australian fashion.
Abstract:
As business process management technology matures, organisations acquire more and more business process models. The resulting collections can consist of hundreds, even thousands of models and their management poses real challenges. One of these challenges concerns model retrieval where support should be provided for the formulation and efficient execution of business process model queries. As queries based on only structural information cannot deal with all querying requirements in practice, there should be support for queries that require knowledge of process model semantics. In this paper we formally define a process model query language that is based on semantic relationships between tasks. This query language is independent of the particular process modelling notation used, but we will demonstrate how it can be used in the context of Petri nets by showing how the semantic relationships can be determined for these nets in such a way that state space explosion is avoided as much as possible. An experiment with three large process model repositories shows that queries expressed in our language can be evaluated efficiently.
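As an informal illustration of querying a repository by a semantic relationship between tasks, the sketch below filters models by an "always precedes" relation assumed to have been pre-computed from each model's behaviour; the relation name, data structures and example models are assumptions, not the query language defined in the paper.

```python
# Illustrative sketch only: filtering a repository of process models by a
# semantic relationship between tasks. The relation and example models are
# assumptions, not the paper's definitions or its Petri-net analysis.
from dataclasses import dataclass, field

@dataclass
class ProcessModel:
    name: str
    # pre-computed semantic relationships, e.g. ("receive order", "ship goods")
    always_precedes: set[tuple[str, str]] = field(default_factory=set)

def query_precedes(repository, first_task, second_task):
    """Return the models in which first_task always precedes second_task."""
    return [m for m in repository
            if (first_task, second_task) in m.always_precedes]

repo = [
    ProcessModel("order-to-cash", {("receive order", "ship goods")}),
    ProcessModel("procure-to-pay", {("approve PO", "pay invoice")}),
]
print([m.name for m in query_precedes(repo, "receive order", "ship goods")])
```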
Abstract:
University libraries play an important role in contributing to the academic achievement of students and faculty members. This study examines perceptions of university library usage to consider factors that influence the achievement of students, academics and administrators. A thorough review of relevant literature examined approaches to determining the satisfaction of student and faculty users, and factors that influence library usage. It highlighted the influence of usage on educational performance and enabled development of a theoretical framework leading to the Factors of Academic Library Usage (FALU) model, which investigates the effect of usage factors. FALU was tested in Kuwait university libraries. The study used validated questionnaires from 792 students, 143 academics and 121 administrators to measure five library factors, and interviews were conducted across the three university libraries. The findings are useful in measuring the correlation between current academic library usage and educational performance.
Abstract:
Bystander is a multi-user, immersive, interactive environment intended for public display in a museum or art gallery. It is designed to make available heritage collections in novel and culturally responsible ways. We use its development as a case study to examine the role played in that process by a range of tools and techniques from participatory design traditions. We describe how different tools were used within the design process, specifically: the ways in which the potential audience members were both included and represented; the prototypes that have been constructed as a way of envisioning how the final work might be experienced; and how these tools have been brought together in ongoing designing and evaluation. We close the paper with some reflections on the extension of participatory commitments into still-emerging areas of technology design that prioritise the design of spaces for human experience and reflective interaction.
Abstract:
Purpose – To investigate and identify the patterns of interaction between searchers and the search engine during web searching. Design/methodology/approach – The authors examined 2,465,145 interactions from 534,507 users of Dogpile.com submitted on May 6, 2005, and compared query reformulation patterns. They investigated the types of query modification and the query modification transitions within sessions. Findings – The paper identifies three strong query reformulation transition patterns: between specialization and generalization; between video and audio; and between content change and system assistance. In addition, the findings show that web and image content were the most popular media collections. Originality/value – This research sheds light on the more complex aspects of web searching involving query modifications.
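A much-simplified sketch of the kind of query reformulation classification underlying such transition analysis is shown below; it labels a pair of consecutive queries by comparing their term sets, whereas the study's actual coding scheme also covers content changes, system assistance and media collection switches.

```python
# Simplified sketch of classifying a query reformulation between two
# consecutive queries in a session; the study's coding scheme is richer.
def classify_reformulation(prev_query: str, next_query: str) -> str:
    prev_terms, next_terms = set(prev_query.split()), set(next_query.split())
    if prev_terms < next_terms:
        return "specialization"   # terms were added
    if next_terms < prev_terms:
        return "generalization"   # terms were removed
    if prev_terms == next_terms:
        return "repeat"
    return "content change"

print(classify_reformulation("jaguar", "jaguar car"))   # -> specialization
```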
Abstract:
The aim of this study was to determine whether spatiotemporal interactions between footballers and the ball in 1 vs. 1 sub-phases are influenced by their proximity to the goal area. Twelve participants (age 15.3 ± 0.5 years) performed as attackers and defenders in 1 vs. 1 dyads across three field positions: (a) attacking the goal, (b) in midfield, and (c) advancing away from the goal area. In each position, the dribbler was required to move beyond an immediate defender with the ball towards the opposition goal. Interactions of attacker-defender dyads were filmed with player and ball displacement trajectories digitized using manual tracking software. One-way repeated measures analysis of variance was used to examine differences in mean defender-to-ball distance after this value had stabilized. Maximum attacker-to-ball distance was also compared as a function of proximity-to-goal. Significant differences were observed for defender-to-ball distance between locations (a) and (c) at the moment when the defender-to-ball distance had stabilized (a: 1.69 ± 0.64 m; c: 1.15 ± 0.59 m; P < 0.05). Findings indicate that proximity-to-goal influenced the performance of players, particularly when attacking or advancing away from goal areas, providing implications for training design in football. In this study, the task constraints of football revealed subtly different player interactions than observed in previous studies of dyadic systems in basketball and rugby union.
Abstract:
Given significant government attention to, and expenditure on, Indigenous equity in Australia, this article addresses a core problem: the lack of a sound understanding of Indigenous social attitudes and priorities. An account of cultural theory raises the likelihood of difference in outlook between Indigenous and non-Indigenous people, including those making and implementing policy. Yet, years of scholarly research and official statistical collections have overlooked potentially critical aspects of Indigeneity. Suggestions of difference emerge from reference to the 2007 Australian Survey of Social Attitudes (AuSSA). If the attitudes recorded in a small sample by this instrument are manifest in the Indigenous population at large, policy priorities and directions should be reviewed and possibly revised. Despite inherent methodological difficulties, the article calls for targeted social attitude research among Australia's Indigenous peoples so that future policy can be better oriented and calibrated. The national benefits would outweigh the costs via better-directed policy making.
Abstract:
In information retrieval (IR) research, more and more focus has been placed on optimizing a query language model by detecting and estimating the dependencies between the query and the observed terms occurring in the selected relevance feedback documents. In this paper, we propose a novel Aspect Language Modeling framework featuring term association acquisition, document segmentation, query decomposition, and an Aspect Model (AM) for parameter optimization. Through the proposed framework, we advance the theory and practice of applying high-order and context-sensitive term relationships to IR. We first decompose a query into subsets of query terms. Then we segment the relevance feedback documents into chunks using multiple sliding windows. Finally we discover the higher-order term associations, that is, the terms in these chunks with a high degree of association to the subsets of the query. In this process, we adopt an approach that combines the AM with Association Rule (AR) mining. In our approach, the AM not only considers the subsets of a query as “hidden” states and estimates their prior distributions, but also evaluates the dependencies between the subsets of a query and the observed terms extracted from the chunks of feedback documents. The AR mining provides a reasonable initial estimation of the high-order term associations by discovering association rules from the document chunks. Experimental results on various TREC collections verify the effectiveness of our approach, which significantly outperforms a baseline language model and two state-of-the-art query language models, namely the Relevance Model and the Information Flow model.
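Two of the steps named above, sliding-window segmentation of feedback documents and the discovery of terms associated with a subset of query terms, could be sketched as follows; the window sizes and the simple co-occurrence count are stand-ins for, not reproductions of, the paper's AM and AR estimation.

```python
# Sketch of two steps named above: sliding-window segmentation of feedback
# documents and a crude co-occurrence count between a query-term subset and
# candidate terms. This stands in for, but is not, the paper's AM + AR method.
from collections import Counter

def sliding_windows(tokens, size=20, step=10):
    """Yield overlapping chunks of a tokenized document."""
    for start in range(0, max(len(tokens) - size + 1, 1), step):
        yield tokens[start:start + size]

def associated_terms(chunks, query_subset, min_support=2):
    """Count terms that co-occur with the whole query subset in a chunk.

    query_subset is a set of query terms; terms already in the subset are
    excluded from the returned candidates.
    """
    counts = Counter()
    for chunk in chunks:
        chunk_set = set(chunk)
        if query_subset <= chunk_set:
            counts.update(chunk_set - query_subset)
    return {t: c for t, c in counts.items() if c >= min_support}
```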
Abstract:
This paper describes the evaluation methodology for benchmarking the effectiveness of cross-lingual link discovery (CLLD). Cross-lingual link discovery is a way of automatically finding prospective links between documents in different languages, which is particularly helpful for knowledge discovery across different language domains. A CLLD evaluation framework is proposed for system performance benchmarking. The framework includes standard document collections, evaluation metrics, and link assessment and evaluation tools. The evaluation methods described in this paper have been utilised to quantify system performance in the NTCIR-9 Crosslink task. It is shown that using manual assessment to generate the gold standard delivers a more reliable evaluation result.
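A minimal sketch of scoring a submitted cross-lingual link run against a gold standard with set-based precision and recall is given below; the NTCIR-9 Crosslink task used more detailed anchor-based and file-based measures, so this is only an assumption-laden illustration of the general idea.

```python
# Minimal sketch of scoring a cross-lingual link run against a gold standard
# with set-based precision and recall; the actual Crosslink measures are richer.
def precision_recall(submitted_links, gold_links):
    """Each link is a (source_article, target_article) pair."""
    submitted, gold = set(submitted_links), set(gold_links)
    hits = submitted & gold
    precision = len(hits) / len(submitted) if submitted else 0.0
    recall = len(hits) / len(gold) if gold else 0.0
    return precision, recall

p, r = precision_recall(
    [("en/Tea", "zh/茶"), ("en/Tea", "ja/緑茶")],   # hypothetical run
    [("en/Tea", "zh/茶")],                          # hypothetical gold standard
)
```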
Abstract:
Cities accumulate and distribute vast sets of digital information. Many decision-making and planning processes in councils, local governments and organisations are based on both real-time and historical data. Until recently, only a small, carefully selected subset of this information has been released to the public – usually for specific purposes (e.g. train timetables and the release of planning applications through websites, to name just a few). This situation is however changing rapidly. Regulatory frameworks, such as Freedom of Information legislation in the US, the UK, the European Union and many other countries, guarantee public access to data held by the state. One of the results of this legislation and changing attitudes towards open data has been the widespread release of public information as part of recent Government 2.0 initiatives. This includes the creation of public data catalogues such as data.gov (U.S.), data.gov.uk (U.K.) and data.gov.au (Australia) at federal government levels, and datasf.org (San Francisco) and data.london.gov.uk (London) at municipal levels. The release of this data has opened up the possibility of a wide range of future applications and services which are now the subject of intensified research efforts. Previous research endeavours have explored the creation of specialised tools to aid decision-making by urban citizens, councils and other stakeholders (Calabrese, Kloeckl & Ratti, 2008; Paulos, Honicky & Hooker, 2009). While these initiatives represent an important step towards open data, they too often result in mere collections of data repositories. Proprietary database formats and the lack of an open application programming interface (API) limit the full potential achievable by allowing these data sets to be cross-queried. Our research, presented in this paper, looks beyond the pure release of data. It is concerned with three essential questions: First, how can data from different sources be integrated into a consistent framework and made accessible? Second, how can ordinary citizens be supported in easily composing data from different sources in order to address their specific problems? Third, what interfaces make it easy for citizens to interact with, access and collect data in an urban environment?