901 results for Interdependent document relevance
Abstract:
Recent advances in neural language models have contributed new methods for learning distributed vector representations of words (also called word embeddings). Two such methods are the continuous bag-of-words model and the skipgram model. These methods have been shown to produce embeddings that capture higher-order relationships between words and that are highly effective in natural language processing tasks involving word similarity and word analogy. Despite these promising results, there has been little analysis of the use of these word embeddings for retrieval. Motivated by these observations, in this paper we set out to determine how these word embeddings can be used within a retrieval model and what the benefit might be. To this end, we use neural word embeddings within the well-known translation language model for information retrieval. This language model captures implicit semantic relations between the words in queries and those in relevant documents, thus producing more accurate estimations of document relevance. The word embeddings used to estimate neural language models produce translations that differ from those of previous translation language model approaches, and these differences deliver improvements in retrieval effectiveness. The models are robust to choices made in building word embeddings and, even more so, our results show that embeddings do not even need to be produced from the same corpus being used for retrieval.
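As a rough illustration of the idea in this abstract, the sketch below scores documents with a translation language model whose translation probabilities come from embedding similarity. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy embeddings, the function names, the clipping of negative cosines to zero, and the unsmoothed maximum-likelihood document model are all hypothetical choices made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def translation_prob(w, u, emb):
    """p_t(w | u): similarity of w to u, normalised over the vocabulary.

    Assumption: negative cosines are clipped to zero so the values form
    a probability distribution over the vocabulary.
    """
    total = sum(max(cosine(emb[x], emb[u]), 0.0) for x in emb)
    return max(cosine(emb[w], emb[u]), 0.0) / total if total else 0.0

def query_likelihood(query, doc, emb):
    """p(q | d) = prod_w sum_u p_t(w | u) * p(u | d).

    p(u | d) is the unsmoothed maximum-likelihood estimate (assumption).
    """
    score = 1.0
    for w in query:
        score *= sum(
            translation_prob(w, u, emb) * doc.count(u) / len(doc)
            for u in set(doc)
        )
    return score

# Toy, hand-made embeddings (hypothetical values).
emb = {
    "car": [1.0, 0.1],
    "automobile": [0.9, 0.2],
    "banana": [0.0, 1.0],
}

doc_about_cars = ["automobile", "automobile"]
doc_about_fruit = ["banana", "banana"]
score_cars = query_likelihood(["car"], doc_about_cars, emb)
score_fruit = query_likelihood(["car"], doc_about_fruit, emb)
```

Here the query term "car" never occurs in either document, yet the document containing "automobile" receives the higher score because the translation probability links the two terms.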
Abstract:
Deviations of policy interest rates from the levels implied by the Taylor rule were persistent before the financial crisis and increased especially after the turn of the century. Compared to the Taylor benchmark, policy rates were often too low. This paper provides evidence that both international spillovers, for instance international dependencies in the interest-rate setting of central banks, and nonlinear reaction patterns can offer a more realistic specification of the Taylor rule in the main industrial countries. The inclusion of international spillovers and, even more, nonlinear dynamics improves the explanatory power of standard Taylor reaction functions. Deviations from Taylor rates tend to be smaller and their negative trend can be eliminated.
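For readers unfamiliar with the benchmark, the sketch below computes the rate implied by the textbook Taylor (1993) rule and the deviation of an observed policy rate from it. The coefficients (0.5 and 0.5), the 2% neutral real rate and the 2% inflation target are the textbook values and are assumptions here; the paper's own specification, which adds spillovers and nonlinear terms, is not reproduced. The function names and the example figures are hypothetical.

```python
def taylor_rate(inflation, output_gap, neutral_real_rate=2.0, inflation_target=2.0):
    """Textbook Taylor (1993) rule, all values in percent.

    i = r* + pi + 0.5 * (pi - pi*) + 0.5 * output_gap
    """
    return (
        neutral_real_rate
        + inflation
        + 0.5 * (inflation - inflation_target)
        + 0.5 * output_gap
    )

def taylor_deviation(policy_rate, inflation, output_gap):
    """Observed policy rate minus the Taylor-implied rate (negative = 'too low')."""
    return policy_rate - taylor_rate(inflation, output_gap)

# Hypothetical example: 4% inflation, output 1% below potential, 5% policy rate.
implied = taylor_rate(4.0, -1.0)
deviation = taylor_deviation(5.0, 4.0, -1.0)
```

A negative deviation corresponds to the situation the abstract describes, in which the policy rate sits below the Taylor benchmark.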
Abstract:
The study proposes to test the ‘IS-Impact’ index as Analytic Theory (AT). It aims to (a) methodically evaluate the ‘relevance’ qualities of IS-Impact, namely Utility and Intuitiveness; in so doing, to (b) document an exemplar of ‘a rigorous approach to relevance’, while (c) treating the overarching study as a higher-order case study having AT as the unit of analysis, and assessing the adequacy of the 6 AT qualities, both for IS-Impact and for similar taxonomies. It also aims to (d) look beyond IS-Impact to other forms of Design Science, considering the generality of the AT qualities; and (e) further validate IS-Impact in new system organisation contexts, taking account of contemporary understandings of construct theorisation, operationalisation and validation.
Abstract:
In information retrieval, a user's query is often not a complete representation of their real information need. The user's information need is a cognitive construction; however, the use of cognitive models to perform query expansion has received little study. In this paper, we present a cognitively motivated query expansion technique that uses semantic features for ad hoc retrieval. This model is evaluated against a state-of-the-art query expansion technique. The results show that our approach provides significant improvements in retrieval effectiveness for the TREC data sets tested.
Abstract:
Quantum theory has recently been employed to further advance the theory of information retrieval (IR). A challenging research topic is to investigate the so-called quantum-like interference in users’ relevance judgement process, in which users are asked to judge the degree of relevance of each document with respect to a given query. In this process, users’ relevance judgement for the current document is often interfered with by their judgements of previous documents, due to interference in the users’ cognitive state. Research from cognitive science has demonstrated some initial evidence of quantum-like cognitive interference in human decision making, which underpins the user’s relevance judgement process. This motivates us to model such cognitive interference in the relevance judgement process, which we believe will lead to better modeling and explanation of user behaviors in the relevance judgement process for IR and eventually to more user-centric IR models. In this paper, we propose to use the probabilistic automaton (PA) and the quantum finite automaton (QFA), which are suitable for representing the transition of user judgement states, to dynamically model the cognitive interference when the user is judging a list of documents.
Abstract:
Particles emitted by vehicles are known to cause detrimental health effects, with their size and oxidative potential among the main factors responsible. Therefore, understanding the relationship between traffic composition and both the physical characteristics and oxidative potential of particles is critical. To contribute to the limited knowledge base in this area, we investigated this relationship in a 4.5 km road tunnel in Brisbane, Australia. On-road concentrations of ultrafine particles (<100 nm, UFPs), fine particles (PM2.5), CO, CO2 and particle associated reactive oxygen species (ROS) were measured using vehicle-based mobile sampling. UFPs were measured using a condensation particle counter and PM2.5 with a DustTrak aerosol photometer. A new profluorescent nitroxide probe, BPEAnit, was used to determine ROS levels. Comparative measurements were also performed on an above-ground road to assess the role of emission dilution on the parameters measured. The profile of UFP and PM2.5 concentration with distance through the tunnel was determined, and demonstrated relationships with both road gradient and tunnel ventilation. ROS levels in the tunnel were found to be high compared to an open road with similar traffic characteristics, which was attributed to the substantial difference in estimated emission dilution ratios on the two roadways. Principal component analysis (PCA) revealed that the levels of pollutants and ROS were generally better correlated with total traffic count, rather than the traffic composition (i.e. diesel and gasoline-powered vehicles). A possible reason for the lack of correlation with HDV, which has previously been shown to be strongly associated with UFPs especially, was the low absolute numbers encountered during the sampling. This may have made their contribution to in-tunnel pollution largely indistinguishable from the total vehicle volume. 
For ROS, the stronger association observed with HDV and gasoline vehicles when combined (total traffic count) compared to when considered individually may signal a role for the interaction of their emissions as a determinant of on-road ROS in this pilot study. If further validated, this should not be overlooked in studies of on- or near-road particle exposure and its potential health effects.
Abstract:
This paper analyses the pairwise distances of signatures produced by the TopSig retrieval model on two document collections. The distribution of the distances is compared to that of purely random signatures. The analysis explains why TopSig is only competitive with state-of-the-art retrieval models at early precision: only the local neighbourhood of the signatures is interpretable. We suggest this is a common property of vector space models.
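The random baseline mentioned in this abstract can be sketched as follows, assuming binary signatures compared by Hamming distance. The signature width, sample size and function names are hypothetical, and no real TopSig signatures are generated here; the sketch only shows the distance distribution that purely random signatures produce.

```python
import random
from itertools import combinations

def hamming(a, b):
    """Number of differing bits between two signatures stored as integers."""
    return bin(a ^ b).count("1")

def pairwise_distances(signatures):
    """Hamming distance for every unordered pair of signatures."""
    return [hamming(a, b) for a, b in combinations(signatures, 2)]

BITS = 64                      # assumed signature width for this toy example
rng = random.Random(0)         # fixed seed so the run is repeatable
random_sigs = [rng.getrandbits(BITS) for _ in range(200)]

dists = pairwise_distances(random_sigs)
mean_distance = sum(dists) / len(dists)   # close to BITS / 2 for random bits
```

For random signatures the distances cluster around half the signature width, so only pairs whose distance falls well below that value carry evidence of similarity.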
Abstract:
A known limitation of the Probability Ranking Principle (PRP) is that it does not cater for dependence between documents. Recently, the Quantum Probability Ranking Principle (QPRP) has been proposed, which implicitly captures dependencies between documents through “quantum interference”. This paper explores whether this new ranking principle leads to improved performance for subtopic retrieval, where novelty and diversity are required. In a thorough empirical investigation, models based on the PRP, as well as other recently proposed ranking strategies for subtopic retrieval (i.e. Maximal Marginal Relevance (MMR) and Portfolio Theory (PT)), are compared against the QPRP. On the given task, it is shown that the QPRP outperforms these other ranking strategies. Unlike MMR and PT, the QPRP requires no parameter estimation or tuning, which is one of its main advantages and makes it both simple and effective. This research demonstrates that the application of quantum theory to problems within information retrieval can lead to significant improvements.
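To make the comparison concrete, the sketch below implements greedy Maximal Marginal Relevance, one of the baselines named in this abstract, and shows the trade-off parameter that must be tuned. It is a minimal sketch with hypothetical similarity scores and function names; it does not implement the QPRP or Portfolio Theory, and it is not the experimental setup of the paper.

```python
def mmr_rank(query_sim, doc_sim, lam=0.5):
    """Greedy MMR re-ranking.

    query_sim: dict mapping document id -> similarity to the query.
    doc_sim:   function (a, b) -> similarity between two documents.
    lam:       trade-off between relevance (1.0) and novelty (0.0).
    """
    remaining = set(query_sim)
    ranked = []
    while remaining:
        def mmr_score(d):
            # Redundancy is the highest similarity to anything already ranked.
            redundancy = max((doc_sim(d, s) for s in ranked), default=0.0)
            return lam * query_sim[d] - (1 - lam) * redundancy
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(remaining), key=mmr_score)
        ranked.append(best)
        remaining.remove(best)
    return ranked

# Hypothetical scores: "a" and "b" are near-duplicates, "c" covers another subtopic.
query_sim = {"a": 0.9, "b": 0.85, "c": 0.5}
pair_sim = {frozenset(("a", "b")): 0.95}

def doc_sim(x, y):
    return pair_sim.get(frozenset((x, y)), 0.1)

diverse = mmr_rank(query_sim, doc_sim, lam=0.5)
relevance_only = mmr_rank(query_sim, doc_sim, lam=1.0)
```

With `lam=1.0` the ranking reduces to ordering by query similarity alone, as the PRP would; lower values promote the document from the other subtopic above the near-duplicate.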
Abstract:
JS-2 is a novel gene located at 5p15.2 and originally detected in primary oesophageal cancer. There is no study on the role of JS-2 in colorectal cancer. The aim of this study is to determine the gene copy number and expression of JS-2 in a large cohort of patients with colorectal tumours and correlate these to the clinicopathological features of the cancer patients. We evaluated the DNA copy number and mRNA expression of JS-2 in 176 colorectal tissues (116 adenocarcinomas, 30 adenomas and 30 non-neoplastic tissues) using real-time polymerase chain reaction. JS-2 expression was also evaluated in two colorectal cancer cell lines and a benign colorectal cell line. JS-2 amplification was noted in 35% of the colorectal adenocarcinomas. Significant differences in relative expression levels for JS-2 mRNA between different colorectal tissues were noted (p = 0.05). Distal colorectal adenocarcinoma had significantly higher copy number than proximal adenocarcinoma (p = 0.005). The relative expression level of JS-2 was different between colonic and rectal adenocarcinoma (p = 0.007). Mucinous adenocarcinoma showed higher JS-2 expression than non-mucinous adenocarcinoma (p = 0.02). Early T-stage cancers appear to have higher JS-2 copy number and lower expression of JS-2 mRNA than later stage cancers (p = 0.001 and 0.03 respectively). Colorectal cancer cell lines showed lower expression of JS-2 than the benign colorectal cell line. JS-2 copy number change and expression were shown for the first time to be altered in the carcinogenesis of colorectal cancer. In addition, genetic alteration of JS-2 was found to be related to location, pathological subtypes and staging of colorectal cancer.
Abstract:
"For myself, I am an optimist - it does not seem to be much use to be anything else" (Winston Churchill). Optimism has its modern roots in philosophy dating back to the 17th century in the writings of philosophers such as Descartes and Voltaire (Domino & Conway, 2001). Prior to these philosophical writings, the concept of optimism was revealed in the teachings of many of the great spiritual traditions such as Buddhism and Christianity (Miller, Richards, & Keller, 2001). In the 20th century, optimism became defined in juxtaposition to pessimism, conceptualized by some as a bipolar unidimensional construct and by others as two related but separate constructs (Garber, 2000). Contemporary models (Scheier & Carver, 1985; Seligman, 1991) have increasingly focused on distinguishing optimism-pessimism as a general dispositional orientation, as described by expectancy theory, and as an explanatory process, described by explanatory style theory.
Abstract:
A key concept in many Information Retrieval (IR) tasks, e.g. document indexing, query language modelling, aspect and diversity retrieval, is the relevance measurement of topics, i.e. to what extent an information object (e.g. a document or a query) is about the topics. This paper investigates the interference in the relevance measurement of a topic caused by another topic. For example, consider that two user groups are required to judge whether a topic q is relevant to a document d, and q is presented together with another topic (referred to as a companion topic). If different companion topics are used for different groups, then, interestingly, different relevance probabilities of q given d can be reached. In this paper, we present empirical results showing that the relevance of a topic to a document is greatly affected by the companion topic’s relevance to the same document, and that the extent of the impact differs between companion topics. We further analyse the phenomenon from classical and quantum-like interference perspectives, and connect the phenomenon to nonreality and contextuality in quantum mechanics. We demonstrate that a quantum-like model fits the empirical data and could potentially be used for predicting relevance when interference exists.
Abstract:
Previous qualitative research has highlighted that temporality plays an important role in relevance for clinical records search. In this study, an investigation is undertaken to determine the effect that the timespan of events within a patient record has on relevance in a retrieval scenario. In addition, based on the standard practice of document length normalisation, a document timespan normalisation model that specifically accounts for timespans is proposed. Initial analysis revealed that, in general, relevant patient records tended to cover a longer timespan of events than non-relevant patient records. However, an empirical evaluation using the TREC Medical Records track supports the opposite view that shorter documents (in terms of timespan) are better for retrieval. These findings highlight that the role of temporality in relevance is complex and that how to deal effectively with temporality within a retrieval scenario remains an open question.
Abstract:
This thesis studies document signatures, which are small representations of documents and other objects that can be stored compactly and compared for similarity. This research finds that document signatures can be effectively and efficiently used to both search and understand relationships between documents in large collections, scalable enough to search a billion documents in a fraction of a second. Deliverables arising from the research include an investigation of the representational capacity of document signatures, the publication of an open-source signature search platform and an approach for scaling signature retrieval to operate efficiently on collections containing hundreds of millions of documents.
Abstract:
Background: Project archives are becoming increasingly large and complex. On construction projects in particular, the increasing amount of information and the increasing complexity of its structure make searching and exploring information in the project archive challenging and time-consuming. Methods: This research investigates a query-driven approach that represents new forms of contextual information to help users understand the set of documents resulting from queries of construction project archives. Specifically, this research extends query-driven interface research by representing three types of contextual information: (1) the temporal context is represented in the form of a timeline to show when each document was created; (2) the search-relevance context shows exactly which of the entered keywords matched each document; and (3) the usage context shows which project participants have accessed or modified a file. Results: We implemented and tested these ideas within a prototype query-driven interface we call VisArchive. VisArchive employs a combination of multi-scale and multi-dimensional timelines, color-coded stacked bar charts, additional supporting visual cues and filters to support searching and exploring historical project archives. The timeline-based interface integrates three interactive timelines as focus + context visualizations. Conclusions: The feasibility of using these visual design principles is tested in two types of project archives: searching construction project archives of an educational building project and tracking of software defects in the Mozilla Thunderbird project. These case studies demonstrate the applicability, usefulness and generality of the design principles implemented.