Abstract:
Scientists need to transfer semantically similar queries across multiple heterogeneous linked datasets. These queries may require data from different locations and the results are not simple to combine due to differences between datasets. A query model was developed to make it simple to distribute queries across different datasets using RDF as the result format. The query model, based on the concept of publicly recognised namespaces for parts of each scientific dataset, was implemented with a configuration that includes a large number of current biological and chemical datasets. The configuration is flexible, providing the ability to transparently use both private and public datasets in any query. A prototype implementation of the model was used to resolve queries for the Bio2RDF website, including both Bio2RDF datasets and other datasets that do not follow the Bio2RDF URI conventions.
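A minimal sketch of how such a distributed query might be issued from client code, assuming the SPARQLWrapper library and that a public Bio2RDF SPARQL endpoint is reachable; the endpoint URLs are assumptions for illustration, not part of the query model described above.

    # Illustrative only: send one query to a public Bio2RDF endpoint and a
    # hypothetical private endpoint, then pool the resulting bindings.
    from SPARQLWrapper import SPARQLWrapper, JSON

    ENDPOINTS = [
        "https://bio2rdf.org/sparql",    # public Bio2RDF endpoint (assumed reachable)
        "http://localhost:8890/sparql",  # hypothetical private mirror
    ]

    QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

    def run_everywhere(query, endpoints):
        """Send the same query to every endpoint and pool the bindings."""
        results = []
        for url in endpoints:
            sparql = SPARQLWrapper(url)
            sparql.setQuery(query)
            sparql.setReturnFormat(JSON)
            try:
                bindings = sparql.query().convert()["results"]["bindings"]
            except Exception:
                continue  # skip endpoints that are unreachable
            results.extend(bindings)
        return results

    if __name__ == "__main__":
        for row in run_everywhere(QUERY, ENDPOINTS):
            print(row)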
Abstract:
We describe a novel two-stage approach to object localization and tracking using a network of wireless cameras and a mobile robot. In the first stage, a robot travels through the camera network while updating its position in a global coordinate frame, which it broadcasts to the cameras. The cameras use this information, along with the image-plane location of the robot, to compute a mapping from their image planes to the global coordinate frame. This is combined with an occupancy map generated by the robot during the mapping process to track the objects. We present results with a nine-node indoor camera network to demonstrate that this approach is feasible and offers an acceptable level of accuracy in terms of object locations.
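A rough sketch of the kind of image-to-ground mapping each camera could compute from the robot's broadcast positions. The direct linear transform below is a standard homography estimate and is only an illustration of the idea; the correspondences in the example are invented.

    import numpy as np

    def estimate_homography(ground_pts, image_pts):
        """Estimate a 3x3 homography mapping image coordinates to ground-plane
        coordinates from >= 4 correspondences, via the direct linear transform."""
        A = []
        for (X, Y), (u, v) in zip(ground_pts, image_pts):
            A.append([u, v, 1, 0, 0, 0, -X * u, -X * v, -X])
            A.append([0, 0, 0, u, v, 1, -Y * u, -Y * v, -Y])
        _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
        return Vt[-1].reshape(3, 3)

    def image_to_ground(H, u, v):
        """Map an image-plane detection to global (ground-plane) coordinates."""
        x = H @ np.array([u, v, 1.0])
        return x[0] / x[2], x[1] / x[2]

    if __name__ == "__main__":
        ground = [(0, 0), (4, 0), (4, 3), (0, 3)]                 # metres (made up)
        image = [(120, 400), (520, 410), (500, 150), (140, 160)]  # pixels (made up)
        H = estimate_homography(ground, image)
        print(image_to_ground(H, 320, 280))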
Abstract:
This paper presents a novel two-stage information filtering model which combines the merits of term-based and pattern-based approaches to effectively filter the sheer volume of information. In particular, the first filtering stage is supported by a novel rough analysis model which efficiently removes a large number of irrelevant documents, thereby addressing the overload problem. The second filtering stage is empowered by a semantically rich pattern taxonomy mining model which effectively fetches incoming documents according to the specific information needs of a user, thereby addressing the mismatch problem. Experiments have been conducted to compare the proposed two-stage filtering (T-SM) model with other possible "term-based + pattern-based" or "term-based + term-based" IF models. The results based on the RCV1 corpus show that the T-SM model significantly outperforms the other types of "two-stage" IF models.
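A toy sketch of the two-stage idea, assuming nothing about the actual rough analysis or pattern taxonomy models: a cheap term-overlap score first discards most documents, and a more expensive pattern match then re-scores the survivors. All scores, thresholds and patterns below are illustrative.

    def stage1_term_score(doc_terms, profile_terms):
        """Cheap overlap score used only to discard clearly irrelevant documents."""
        return len(doc_terms & profile_terms) / (len(profile_terms) or 1)

    def stage2_pattern_score(doc_terms, patterns):
        """More expensive check: reward documents containing whole term patterns."""
        return sum(weight for pattern, weight in patterns if pattern <= doc_terms)

    def filter_stream(docs, profile_terms, patterns, threshold=0.2):
        relevant = []
        for doc_id, doc_terms in docs:
            if stage1_term_score(doc_terms, profile_terms) < threshold:
                continue  # stage 1: remove the bulk of irrelevant documents
            relevant.append((stage2_pattern_score(doc_terms, patterns), doc_id))
        return sorted(relevant, reverse=True)

    docs = [("d1", {"rough", "set", "filtering"}), ("d2", {"football"})]
    profile = {"rough", "filtering", "pattern"}
    patterns = [(frozenset({"rough", "set"}), 2.0), (frozenset({"filtering"}), 1.0)]
    print(filter_stream(docs, profile, patterns))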
Abstract:
Universities continue to struggle with the need to combine the pedagogical benefits of collaborative learning with large-scale, interactive and technologically sophisticated learning and teaching approaches and support systems. This challenge requires imaginative approaches if the outcome is not to be the 'worst of both worlds', resulting in confusion and disillusionment amongst students. This paper presents three case studies that use online technologies to provide collaborative teaching solutions arguably much superior to those possible without an online intervention.
Abstract:
While recent research has provided valuable information as to the composition of laser printer particles and their formation mechanisms, and has explained why some printers are emitters whilst others are low emitters, fundamental questions relating to the potential exposure of office workers remain unanswered. In particular: (i) what impact does the operation of laser printers have on the background particle number concentration (PNC) of an office environment over the duration of a typical working day? (ii) what is the airborne particle exposure of office workers in the vicinity of laser printers? (iii) what influence does the office ventilation have upon the transport and concentration of particles? (iv) is there a need to control the generation and/or transport of particles arising from the operation of laser printers within an office environment? (v) what instrumentation and methodology are relevant for characterising such particles within an office location? We present experimental evidence on temporal and spatial printer PNC during the operation of 107 laser printers within open-plan offices of five buildings. We show for the first time that the eight-hour time-weighted average printer particle exposure is significantly less than the eight-hour time-weighted local background particle exposure, but that peak printer particle exposure can be more than two orders of magnitude higher than local background particle exposure. The particle size range is predominantly ultrafine (<100 nm diameter). In addition, we have established that office workers are constantly exposed to non-printer-derived particle concentrations, with up to an order of magnitude difference in such exposure amongst offices, and propose that this exposure be controlled along with exposure to printer-derived particles. We also propose, for the first time, that peak particle reference values be calculated for each office area, analogous to the criteria used in Australia and elsewhere for evaluating exposure excursions above occupational hazardous chemical exposure standards. A universal peak particle reference value of 2.0 × 10⁴ particles cm⁻³ has been proposed.
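A small worked sketch of the exposure metrics mentioned above: an eight-hour time-weighted average computed from a PNC time series, and a count of samples exceeding the proposed universal peak reference value of 2.0 × 10⁴ particles cm⁻³. The sampling interval and concentration values are invented for illustration.

    import numpy as np

    PEAK_REFERENCE = 2.0e4  # proposed universal peak reference value, particles cm^-3

    def eight_hour_twa(pnc, interval_minutes):
        """Time-weighted average over an 8 h shift from equally spaced PNC samples,
        treating any unsampled remainder of the shift as zero exposure."""
        hours = len(pnc) * interval_minutes / 60.0
        return np.mean(pnc) * min(hours, 8.0) / 8.0

    # Hypothetical 1-minute PNC samples (particles cm^-3) over a working day,
    # with a short burst during a large print job.
    pnc = np.concatenate([np.full(460, 4.0e3), np.full(20, 6.0e4)])

    print("8-h TWA:", eight_hour_twa(pnc, 1))
    print("samples above peak reference:", int(np.sum(pnc > PEAK_REFERENCE)))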
Abstract:
This paper presents a framework for evaluating information retrieval of medical records. We use the BLULab corpus, a large collection of real-world de-identified medical records. The collection has been hand-coded by clinical terminologists using the ICD-9 medical classification system. The ICD codes are used to devise queries and relevance judgements for this collection. Results of initial test runs using a baseline IR system are provided. Queries and relevance judgements are available online to aid further research in medical IR. Please visit: http://koopman.id.au/med_eval.
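A minimal sketch of how hand-assigned ICD-9 codes can be turned into queries and relevance judgements, assuming only that each record carries a list of codes and that each code has a textual description; the field names and example codes are invented for illustration.

    from collections import defaultdict

    # Hypothetical inputs: code descriptions and per-record code assignments.
    icd_descriptions = {"428.0": "congestive heart failure",
                        "250.00": "diabetes mellitus type 2"}
    record_codes = {"rec-001": ["428.0"], "rec-002": ["428.0", "250.00"], "rec-003": []}

    queries, qrels = {}, defaultdict(set)
    for qid, (code, description) in enumerate(icd_descriptions.items(), start=1):
        queries[qid] = description        # the code's description becomes the query text
        for rec_id, codes in record_codes.items():
            if code in codes:
                qrels[qid].add(rec_id)    # records coded with the ICD code are relevant

    print(queries)
    print(dict(qrels))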
Abstract:
Modelling how a word is activated in human memory is an important requirement for determining the probability of recall of a word in an extra-list cueing experiment. The spreading activation, spooky-action-at-a-distance and entanglement models have all been used to model the activation of a word. Recently, a hypothesis was put forward that the mean activation levels of the respective models are ordered as follows: Spreading ≤ Entanglement ≤ Spooky-action-at-a-distance. This article investigates this hypothesis by means of a substantial empirical analysis of each model using the University of South Florida word association, rhyme and word fragment norms.
Abstract:
Since manually constructing domain-specific sentiment lexicons is extremely time consuming, and may not even be feasible for domains where linguistic expertise is not available, research on the automatic construction of domain-specific sentiment lexicons has become a hot topic in recent years. The main contribution of this paper is the illustration of a novel semi-supervised learning method which exploits both term-to-term and document-to-term relations hidden in a corpus for the construction of domain-specific sentiment lexicons. More specifically, the proposed two-pass pseudo-labeling method combines shallow linguistic parsing and corpus-based statistical learning to make domain-specific sentiment extraction scalable with respect to the sheer volume of opinionated documents archived on the Internet these days. Another novelty of the proposed method is that it can utilize the readily available user-contributed labels of opinionated documents (e.g., the user ratings of product reviews) to bootstrap the performance of sentiment lexicon construction. Our experiments show that the proposed method can generate high-quality domain-specific sentiment lexicons as directly assessed by human experts. Moreover, the system-generated domain-specific sentiment lexicons can improve polarity prediction tasks at the document level by 2.18% when compared to other well-known baseline methods. Our research opens the door to the development of practical and scalable methods for domain-specific sentiment analysis.
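A heavily simplified sketch of the bootstrapping idea, not the paper's actual two-pass method: user ratings pseudo-label whole documents in a first pass, and a second pass scores each candidate term by the average polarity of the documents it occurs in. Thresholds and example data are made up.

    from collections import defaultdict

    def build_lexicon(reviews, neutral=3.0):
        """Pass 1: pseudo-label documents from user ratings.
        Pass 2: score each candidate term by the mean pseudo-label of its documents."""
        doc_labels = [(terms, 1.0 if rating > neutral else -1.0)
                      for terms, rating in reviews if rating != neutral]
        scores, counts = defaultdict(float), defaultdict(int)
        for terms, label in doc_labels:
            for t in set(terms):
                scores[t] += label
                counts[t] += 1
        return {t: scores[t] / counts[t] for t in scores}

    reviews = [(["great", "battery", "life"], 5),
               (["terrible", "battery"], 1),
               (["great", "screen"], 4)]
    print(build_lexicon(reviews))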
Abstract:
Particle number concentrations and size distributions, visibility, particulate mass concentrations and weather parameters were monitored in Brisbane, Australia, on 23 September 2009, during the passage of a dust storm that originated 1400 km away in the dry continental interior. The dust concentration peaked at about mid-day, when the hourly average PM2.5 and PM10 values reached 814 and 6460 µg m⁻³, respectively, with a sharp drop in atmospheric visibility. A linear regression analysis showed a good correlation between the coefficient of light scattering by particles (Bsp) and both PM10 and PM2.5. The particle number in the size range 0.5-20 µm exhibited a lognormal size distribution with modal and geometric mean diameters of 1.6 and 1.9 µm, respectively. The mode of the mass distribution was around 10 µm, with less than 10% of the mass carried by particles smaller than 2.5 µm. The PM10 fraction accounted for about 68% of the total mass. By mid-day, as the dust began to increase sharply, the ultrafine particle number concentration fell from about 6×10³ cm⁻³ to 3×10³ cm⁻³ and then continued to decrease to less than 1×10³ cm⁻³ by 14:00, showing a power-law decrease with Bsp with an R² value of 0.77 (p<0.01). Ultrafine particle size distributions also showed a significant decrease in number during the dust storm. This is the first scientific study of particle size distributions in an Australian dust storm.
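The reported power-law relation between ultrafine PNC and Bsp can be recovered from paired observations with an ordinary log-log least-squares fit; the sketch below uses invented numbers purely to show the calculation, not the study's data.

    import numpy as np

    # Hypothetical paired observations during the dust event.
    bsp = np.array([50., 120., 300., 800., 2000.])       # light-scattering coefficient
    ufp = np.array([6.0e3, 4.5e3, 3.0e3, 1.8e3, 0.9e3])  # ultrafine PNC (cm^-3)

    # Fit N = a * Bsp^b, i.e. log N = log a + b log Bsp.
    b, log_a = np.polyfit(np.log(bsp), np.log(ufp), 1)
    pred = log_a + b * np.log(bsp)
    ss_res = np.sum((np.log(ufp) - pred) ** 2)
    ss_tot = np.sum((np.log(ufp) - np.log(ufp).mean()) ** 2)
    print(f"exponent b = {b:.2f}, R^2 = {1 - ss_res / ss_tot:.2f}")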
Abstract:
Models of word meaning, built from a corpus of text, have demonstrated success in emulating human performance on a number of cognitive tasks. Many of these models use geometric representations of words to store semantic associations between words, and often word order information is not captured in these models. The lack of structural information used by these models has been raised as a weakness when performing cognitive tasks. This paper presents an efficient tensor-based approach to modelling word meaning that builds on recent attempts to encode word order information, while providing flexible methods for extracting task-specific semantic information.
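One widely used way of folding word order into a geometric word representation is to bind adjacent word vectors with circular convolution, as in holographic-style order encodings. The sketch below illustrates that operation only; it is not the specific tensor model proposed in the paper, and the vocabulary and dimensionality are arbitrary.

    import numpy as np

    def circular_convolution(a, b):
        """Bind two word vectors into a single order-sensitive vector."""
        return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

    rng = np.random.default_rng(0)
    dim = 1024
    vec = {w: rng.normal(0, 1 / np.sqrt(dim), dim) for w in ["wall", "street", "journal"]}

    # Circular convolution alone is commutative, so position is usually marked by
    # permuting one operand; here the left word is shifted before binding.
    left = np.roll(vec["wall"], 1)
    bigram = circular_convolution(left, vec["street"])
    print(bigram[:5])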
Abstract:
How do humans respond to their social context? This question is becoming increasingly urgent in a society where democracy requires that the citizens of a country help to decide upon its policy directions, and yet those citizens frequently have very little knowledge of the complex issues that these policies seek to address. Frequently, we find that humans make their decisions more with reference to their social setting than to the arguments of scientists, academics, and policy makers. It is broadly anticipated that agent-based modelling (ABM) of human behaviour will make it possible to treat such social effects, but we take the position here that a more sophisticated treatment of context will be required in many such models. While notions such as historical context (where the past history of an agent might affect its later actions) and situational context (where the agent will choose a different action in a different situation) abound in ABM scenarios, we will discuss a case of a potentially changing context, where social effects can have a strong influence upon the perceptions of a group of subjects. In particular, we shall discuss a recently reported case where a biased worm in an election debate led to significant distortions in the reports given by participants as to who won the debate (Davis et al., 2011). Thus, participants in a different social context drew different conclusions about the perceived winner of the same debate, with associated significant differences between the two groups as to whom they would vote for in the coming election. We extend this example to the problem of modelling the likely electoral responses of agents in the context of the climate change debate, and discuss the notion of interference between related questions that might be asked of an agent in a social simulation intended to simulate their likely responses. A modelling technology which could account for such strong social contextual effects would benefit regulatory bodies which need to navigate between multiple interests and concerns, and we shall present one viable avenue for constructing such a technology. A geometric approach will be presented, where the internal state of an agent is represented in a vector space, and their social context is naturally modelled as a set of basis states that are chosen with reference to the problem space.
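A minimal sketch of the geometric representation described in the last sentence, under the assumption that an agent's response probabilities are read off as squared projections of a unit state vector onto an orthonormal basis chosen for the current social context; the state, bases and interpretation are invented for illustration.

    import numpy as np

    def response_probabilities(state, context_basis):
        """Squared projections of a unit internal state onto context basis vectors."""
        state = state / np.linalg.norm(state)
        return np.array([np.dot(state, b) ** 2 for b in context_basis])

    state = np.array([0.8, 0.6])                            # agent's internal state
    debate_context = np.eye(2)                              # basis for one social context
    worm_context = np.array([[np.cos(0.5), np.sin(0.5)],
                             [-np.sin(0.5), np.cos(0.5)]])  # rotated basis: another context

    print(response_probabilities(state, debate_context))    # e.g. P(A won), P(B won)
    print(response_probabilities(state, worm_context))      # same state, different context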
Abstract:
Language Modeling (LM) has been successfully applied to Information Retrieval (IR). However, most of the existing LM approaches only rely on term occurrences in documents, queries and document collections. In traditional unigram-based models, terms (or words) are usually considered to be independent. In some recent studies, dependence models have been proposed to incorporate term relationships into LM, so that links can be created between words in the same sentence, and term relationships (e.g. synonymy) can be used to expand the document model. In this study, we further extend this family of dependence models in the following two ways: (1) term relationships are used to expand the query model instead of the document model, so that the query expansion process can be naturally implemented; (2) we exploit more sophisticated inferential relationships extracted with Information Flow (IF). Information flow relationships are not simply pairwise term relationships as those used in previous studies, but hold between a set of terms and another term. They allow for context-dependent query expansion. Our experiments conducted on TREC collections show that we can obtain large and significant improvements with our approach. This study shows that LM is an appropriate framework to implement effective query expansion.
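A sketch of the general shape of query-model expansion with set-to-term relationships, assuming a simple linear interpolation between the original maximum-likelihood query model and terms suggested by information-flow rules; the mixing weight and rules are illustrative, not the paper's estimates.

    from collections import Counter

    def expand_query_model(query_terms, if_rules, lam=0.7):
        """Interpolate the ML query model with terms inferred from whole-query context."""
        p_ml = Counter(query_terms)
        total = sum(p_ml.values())
        model = {t: lam * c / total for t, c in p_ml.items()}

        # Information-flow rules map a *set* of terms to an inferred term with a strength.
        fired = [(term, s) for premise, term, s in if_rules if premise <= set(query_terms)]
        z = sum(s for _, s in fired) or 1.0
        for term, s in fired:
            model[term] = model.get(term, 0.0) + (1 - lam) * s / z
        return model

    rules = [(frozenset({"space", "program"}), "nasa", 0.9),
             (frozenset({"space"}), "orbit", 0.4)]
    print(expand_query_model(["space", "program"], rules))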
Abstract:
In information retrieval, a user's query is often not a complete representation of their real information need. The user's information need is a cognitive construction; however, the use of cognitive models to perform query expansion has received little study. In this paper, we present a cognitively motivated query expansion technique that exploits semantic features for ad hoc retrieval. This model is evaluated against a state-of-the-art query expansion technique. The results show our approach provides significant improvements in retrieval effectiveness for the TREC data sets tested.
Abstract:
Intuitively, any ‘bag of words’ approach in IR should benefit from taking term dependencies into account. Unfortunately, for years the results of exploiting such dependencies have been mixed or inconclusive. To improve the situation, this paper shows how the natural language properties of the target documents can be used to transform and enrich the term dependencies into more useful statistics. This is done in three steps. The term co-occurrence statistics of queries and documents are each represented by a Markov chain. The paper proves that such a chain is ergodic, and therefore its asymptotic behavior is unique, stationary, and independent of the initial state. Next, the stationary distribution is taken to model queries and documents, rather than their initial distributions. Finally, ranking is achieved following the customary language modeling paradigm. The main contribution of this paper is to argue why the asymptotic behavior of the document model is a better representation than just the document’s initial distribution. A secondary contribution is to investigate the practical application of this representation as queries become increasingly verbose. In the experiments (based on Lemur’s search engine substrate) the default query model was replaced by the stable distribution of the query. Just modeling the query this way already resulted in significant improvements over a standard language model baseline. The results were on a par with or better than more sophisticated algorithms that use fine-tuned parameters or extensive training. Moreover, the more verbose the query, the more effective the approach seems to become.
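A small sketch of the central step, assuming the term co-occurrence counts have already been collected: build a row-stochastic transition matrix over terms and iterate it to its stationary distribution, which then replaces the initial count-based query or document model. The toy counts are invented.

    import numpy as np

    def stationary_distribution(cooccurrence, tol=1e-10, max_iter=10_000):
        """Stationary distribution of the Markov chain defined by co-occurrence counts."""
        C = np.asarray(cooccurrence, dtype=float)
        P = C / C.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
        pi = np.full(P.shape[0], 1.0 / P.shape[0])
        for _ in range(max_iter):
            nxt = pi @ P
            if np.abs(nxt - pi).sum() < tol:
                break
            pi = nxt
        return pi

    # Toy term-term co-occurrence counts for a three-term "document".
    counts = [[4, 2, 1],
              [2, 3, 1],
              [1, 1, 2]]
    print(stationary_distribution(counts))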
Abstract:
Information mismatch and overload are two fundamental issues influencing the effectiveness of information filtering systems. Even though both term-based and pattern-based approaches have been proposed to address these issues, neither approach alone can provide a satisfactory decision for determining the relevant information. This paper presents a novel two-stage decision model for solving these issues. The first stage is a novel rough analysis model to address the overload problem. The second stage is a pattern taxonomy mining model to address the mismatch problem. The experimental results on RCV1 and TREC filtering topics show that the proposed model significantly outperforms state-of-the-art filtering systems.