203 results for Concept-based Retrieval


Relevance: 30.00%

Abstract:

In a pilot application based on a web search engine, called Web-based Relation Completion (WebRC), we propose to join two columns of entities linked by a predefined relation by mining knowledge from the web through the search engine. To achieve this, a novel retrieval task, Relation Query Expansion (RelQE), is modelled: given an entity (query), the task is to retrieve documents containing entities in a predefined relation to the given one. Solving this problem entails expanding the query before submitting it to a web search engine, to ensure that mostly documents containing the linked entity are returned in the top K search results. In this paper, we propose a novel Learning-based Relevance Feedback (LRF) approach to solve this retrieval task. Expansion terms are learned from training pairs of entities linked by the predefined relation and applied to new entity queries to find entities linked by the same relation. After describing the approach, we present experimental results on real-world web data collections, which show that the LRF approach always improves the precision of top-ranked search results, by up to 8.6 times over the baseline. Using LRF, WebRC also performs well above the baseline.
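
To make the RelQE/LRF idea concrete, here is a minimal sketch of learning expansion terms from training documents and applying them to a new entity query. The corpus, the frequency-based term scoring, and all names are illustrative assumptions, not the paper's actual implementation.

from collections import Counter

def learn_expansion_terms(training_docs, num_terms=3):
    # Score terms by how many relation-bearing training documents
    # contain them; the top scorers become expansion terms.
    counts = Counter()
    for doc in training_docs:
        counts.update(set(doc.lower().split()))
    return [term for term, _ in counts.most_common(num_terms)]

def expand_query(entity, expansion_terms):
    # Append the learned terms so the search engine favours documents
    # likely to mention the linked entity.
    return entity + " " + " ".join(expansion_terms)

# Training documents retrieved for entity pairs in a "CEO of" relation
docs = ["tim cook is the chief executive officer of apple",
        "satya nadella is the chief executive officer of microsoft"]
print(expand_query("Amazon", learn_expansion_terms(docs)))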

Relevance: 30.00%

Abstract:

Electrochemical aptamer-based (E-AB) sensors represent an emerging class of recently developed sensors. However, many of these sensors are limited by the low surface density of the electrode-bound redox-oligonucleotides used as probes. Here we propose to use the concept of electrochemical current rectification (ECR) to enhance the redox signal of E-AB sensors. Commonly, the probe DNA undergoes a conformational change upon target binding, enabling a single, non-recurring charge transfer between the redox tag and the electrode. In our system, the redox tag of the probe DNA is continuously replenished by solution-phase redox molecules: a unidirectional electron transfer arises from the electrode via the surface-linked redox tag to the solution-phase redox molecules, which efficiently amplifies the current response. Using this robust and straightforward strategy, the developed sensor showed substantial signal amplification and consequently improved sensitivity, with a calculated detection limit of 114 nM for ATP. This is an improvement of one order of magnitude over amplification-free detection, and superior to previous detection results using enzyme- or nanomaterial-based signal amplification. To the best of our knowledge, this is the first demonstration of an aptamer-based electrochemical biosensor involving electrochemical rectification, a strategy that can presumably be transferred to other biomedical sensor systems.

Relevance: 30.00%

Abstract:

Network reconfiguration is an important stage in restoring a power system after a complete blackout or a local outage, and reasonable planning of the reconfiguration procedure is essential for rapid restoration. An approach for evaluating the importance of a line is first proposed, based on the line contraction concept. Then, interpretive structural modeling (ISM) is employed to analyze the relationships among the factors that affect network reconfiguration. Priority is given to the security and speed of restoring generating units, and a method is proposed to select the generating unit to be restored by maximizing the restoration benefit, considering both the generation capacity of the restored unit and the importance of the lines in the restoration path. The start-up sequence of generating units and the related restoration paths are optimized together in the proposed method, avoiding the shortcomings of solving these two issues separately, as existing methods do. Finally, the New England 10-unit 39-bus power system and the Guangdong power system in South China are employed to demonstrate the basic features of the proposed method.
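
As a rough illustration of the unit-selection step, the sketch below scores candidate generating units by a restoration benefit that combines generation capacity with the importance of the lines along each unit's restoration path. The weights, field names, paths, and importance values are invented assumptions, not the paper's actual formulation.

def restoration_benefit(unit, path, line_importance, alpha=1.0, beta=100.0):
    # Benefit grows with the unit's capacity and with the importance
    # (from the line contraction analysis) of the lines its path restores.
    return alpha * unit["capacity_mw"] + beta * sum(line_importance[l] for l in path)

units = [{"name": "G1", "capacity_mw": 600}, {"name": "G2", "capacity_mw": 350}]
paths = {"G1": ["L12", "L23"], "G2": ["L45"]}      # hypothetical restoration paths
importance = {"L12": 0.9, "L23": 0.4, "L45": 0.8}  # hypothetical line importance
best = max(units, key=lambda u: restoration_benefit(u, paths[u["name"]], importance))
print(best["name"])  # the unit to restore next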

Relevance: 30.00%

Abstract:

Advances in neural network language models have demonstrated that these models can effectively learn representations of word meaning. In this paper, we explore a variation of neural language models that learns from concepts drawn from structured ontologies and extracted from free text, rather than directly from the terms in free text. This model is employed for the task of measuring semantic similarity between medical concepts, a task central to a number of techniques in medical informatics and information retrieval. The model is built with two medical corpora (journal abstracts and patient records) and empirically validated on two ground-truth datasets of human-judged concept pairs assessed by medical professionals. Empirically, our approach correlates closely with expert human assessors (≈ 0.9) and outperforms a number of state-of-the-art benchmarks for medical semantic similarity. The demonstrated superiority of this model as an effective semantic similarity measure is promising, in that it may translate into effectiveness gains for techniques in medical information retrieval and medical informatics (e.g., query expansion and literature-based discovery).
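
A minimal sketch of the idea, assuming a skip-gram model trained on sequences of ontology concept IDs rather than raw terms; the gensim setup and the UMLS-style CUIs below are illustrative placeholders, not the paper's corpora or configuration.

from gensim.models import Word2Vec

# Each "sentence" is the sequence of concept IDs extracted from one document.
corpus = [
    ["C0011849", "C0020538", "C0027051"],
    ["C0011849", "C0027051", "C0004238"],
    ["C0020538", "C0004238", "C0011849"],
]
model = Word2Vec(corpus, vector_size=50, window=5, min_count=1, sg=1, seed=1)
# Semantic similarity between two medical concepts as the cosine
# similarity of their learned embeddings.
print(model.wv.similarity("C0011849", "C0020538"))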

Relevance: 30.00%

Abstract:

Large volumes of heterogeneous health data silos pose a big challenge when exploring information to support evidence-based decision making and ensure quality outcomes. In this paper, we present a proof of concept for adopting data warehousing technology to aggregate and analyse disparate health data in order to understand the impact of various lifestyle factors on obesity. We present a practical model for data warehousing, with a detailed explanation, which can be adopted similarly for studying various other health issues.
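
A toy sketch of the warehousing idea: a fact table of patient observations joined to a lifestyle dimension and aggregated, star-schema style. The schema and values are invented for illustration; the paper's model is more detailed.

import pandas as pd

# Fact table: one row per patient observation
fact = pd.DataFrame({"patient_id": [1, 1, 2, 3],
                     "lifestyle_id": [10, 10, 11, 10],
                     "bmi": [31.2, 30.8, 24.5, 29.9]})
# Dimension table: lifestyle attributes
dim = pd.DataFrame({"lifestyle_id": [10, 11],
                    "activity_level": ["sedentary", "active"]})
# Star-schema query: a health outcome aggregated by a lifestyle factor
print(fact.merge(dim, on="lifestyle_id").groupby("activity_level")["bmi"].mean())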

Relevance: 30.00%

Abstract:

Decision-making is such an integral aspect of healthcare routine that the ability to make the right decisions at crucial moments can lead to patient health improvements. Evidence-based practice (EBP), the paradigm used to make those informed decisions, relies on the use of the current best evidence from systematic research such as randomized controlled trials (RCTs). Limitations of the outcomes from RCTs, such as the "quantity" and "quality" of the evidence generated, have lowered healthcare professionals' confidence in using EBP. An alternative paradigm, practice-based evidence, has evolved, the key being evidence drawn from practice settings. Through the use of health information technology, electronic health records (EHRs) capture relevant clinical practice "evidence". A data-driven approach is proposed to capitalize on the benefits of EHRs. The issues of data privacy, security, and integrity are addressed by an information accountability concept. A data warehouse architecture completes the data-driven approach by integrating health data from the multi-source systems unique to the healthcare environment.

Relevance: 30.00%

Abstract:

Objective: This paper presents an automatic, active learning-based system for the extraction of medical concepts from clinical free-text reports. Specifically, we determine (1) the contribution of active learning in reducing the annotation effort, and (2) the robustness of an incremental active learning framework across different selection criteria and datasets. Materials and methods: The comparative performance of an active learning framework and a fully supervised approach was investigated to study how active learning reduces the annotation effort while achieving the same effectiveness as a supervised approach. Conditional Random Fields were used as the supervised method, with least confidence and information density as the two selection criteria for the active learning framework. The effect of incremental vs. standard learning on the robustness of the models within the active learning framework, under different selection criteria, was also investigated. Two clinical datasets were used for evaluation: the i2b2/VA 2010 NLP challenge and the ShARe/CLEF 2013 eHealth Evaluation Lab. Results: The annotation effort saved by active learning to achieve the same effectiveness as supervised learning is up to 77%, 57%, and 46% of the total number of sequences, tokens, and concepts, respectively. Compared to the random sampling baseline, the saving is at least doubled. Discussion: Incremental active learning guarantees robustness across all selection criteria and datasets; the reduction in annotation effort is always above the random sampling and longest-sequence baselines. Conclusion: Incremental active learning is a promising approach for building effective and robust medical concept extraction models, while significantly reducing the burden of manual annotation.
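
The core loop is easy to sketch. The paper uses CRFs over clinical sequences; the sketch below substitutes a simple classifier on synthetic data so that the least-confidence selection and the incremental growth of the labeled set stay visible. Everything here is a stand-in, not the paper's system.

import numpy as np
from sklearn.linear_model import LogisticRegression

def least_confidence(model, X_pool, k=10):
    # Pick the k pool items whose most probable label has the lowest
    # probability, i.e. where the model is least confident.
    confidence = model.predict_proba(X_pool).max(axis=1)
    return np.argsort(confidence)[:k]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)                 # synthetic "annotations"
labeled, pool = list(range(10)), list(range(10, 200))
model = LogisticRegression().fit(X[labeled], y[labeled])
for _ in range(5):                            # five annotation rounds
    picks = [pool[i] for i in least_confidence(model, X[pool])]
    labeled += picks                          # simulate manual annotation
    pool = [i for i in pool if i not in picks]
    model.fit(X[labeled], y[labeled])         # refit on the grown labeled set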

Relevance: 30.00%

Abstract:

Fractional anisotropy (FA), a very widely used measure of fiber integrity based on diffusion tensor imaging (DTI), is a problematic concept, as it is influenced by several quantities, including the number of dominant fiber directions within each voxel, each fiber's anisotropy, and partial volume effects from neighboring gray matter. With high-angular-resolution diffusion imaging (HARDI) and the tensor distribution function (TDF), one can reconstruct multiple underlying fibers per voxel, along with their individual anisotropy measures, by representing the diffusion profile as a probabilistic mixture of tensors. We found that FA, when compared with TDF-derived anisotropy measures, correlates poorly with individual fiber anisotropy and may sub-optimally detect disease processes that affect myelination. By contrast, mean diffusivity (MD) as defined in standard DTI appears to be more accurate. Overall, we argue that novel measures derived from the TDF approach may yield more sensitive and accurate information than DTI-derived measures.
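
For reference, FA is computed from the three eigenvalues of the diffusion tensor; the function below implements the standard DTI definition. The eigenvalues in the usage line are typical single-fiber white-matter values, chosen for illustration.

import numpy as np

def fractional_anisotropy(l1, l2, l3):
    # Standard DTI definition: normalised spread of the eigenvalues.
    num = np.sqrt((l1 - l2)**2 + (l2 - l3)**2 + (l3 - l1)**2)
    den = np.sqrt(2.0 * (l1**2 + l2**2 + l3**2))
    return num / den

print(fractional_anisotropy(1.7e-3, 0.3e-3, 0.3e-3))  # ~0.80, one coherent fiber

Because the same FA value can arise from one degraded fiber or from two intact crossing fibers, per-voxel FA conflates exactly the quantities the TDF approach separates.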

Relevance: 30.00%

Abstract:

For the first decade of its existence, the concept of citizen journalism has described an approach which was seen as a broadening of the participant base in journalistic processes, but still involved only a comparatively small subset of overall society – for the most part, citizen journalists were news enthusiasts and “political junkies” (Coleman, 2006) who, as some exasperated professional journalists put it, “wouldn’t get a job at a real newspaper” (The Australian, 2007), but nonetheless followed many of the same journalistic principles. The investment – if not of money, then at least of time and effort – involved in setting up a blog or participating in a citizen journalism Website remained substantial enough to prevent the majority of Internet users from engaging in citizen journalist activities to any significant extent; what emerged in the form of news blogs and citizen journalism sites was a new online elite which for some time challenged the hegemony of the existing journalistic elite, but gradually also merged with it. The mass adoption of next-generation social media platforms such as Facebook and Twitter, however, has led to the emergence of a new wave of quasi-journalistic user activities which now much more closely resemble the “random acts of journalism” which JD Lasica envisaged in 2003. Social media are not exclusively or even predominantly used for citizen journalism; instead, citizen journalism is now simply a by-product of user communities engaging in exchanges about the topics which interest them, or tracking emerging stories and events as they happen. Such platforms – and especially Twitter with its system of ad hoc hashtags that enable the rapid exchange of information about issues of interest – provide spaces for users to come together to “work the story” through a process of collaborative gatewatching (Bruns, 2005), content curation, and information evaluation which takes place in real time and brings together everyday users, domain experts, journalists, and potentially even the subjects of the story themselves. Compared to the spaces of news blogs and citizen journalism sites, but also of conventional online news Websites, which are controlled by their respective operators and inherently position user engagement as a secondary activity to content publication, these social media spaces are centred around user interaction, providing a third-party space in which everyday as well as institutional users, laypeople as well as experts converge without being able to control the exchange. Drawing on a number of recent examples, this article will argue that this results in a new dynamic of interaction and enables the emergence of a more broadly-based, decentralised, second wave of citizen engagement in journalistic processes.

Relevance: 30.00%

Abstract:

Bioacoustic monitoring has become a significant research topic for species diversity conservation. Owing to developments in sensing techniques, acoustic sensors are widely deployed in the field to record animal sounds over large spatial and temporal scales. With large volumes of collected audio data, it is essential to develop semi-automatic or automatic techniques to analyse the data; this can help ecologists make decisions on how to protect and promote species diversity. This paper presents generic features to characterize a range of bird species for vocalisation retrieval. In the implementation, audio recordings are first converted to spectrograms using the short-time Fourier transform, and a ridge detection method is then applied to the spectrogram to detect points of interest. Based on the detected points, a new region representation is explored for describing various bird vocalisations, and a local descriptor comprising temporal entropy, frequency-bin entropy, and a histogram of counts of four ridge directions is calculated for each sub-region. To speed up the retrieval process, indexing is carried out and the retrieved results are ranked according to similarity scores. The experimental results show that our proposed feature set achieves a retrieval success rate of 0.71, outperforming spectral ridge features alone (0.55) and Mel-frequency cepstral coefficients (0.36).
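
A minimal sketch of the front end of such a pipeline, using spectrogram peaks as a simplified stand-in for the paper's ridge detector; window sizes and thresholds are illustrative assumptions.

import numpy as np
from scipy.signal import spectrogram
from scipy.ndimage import maximum_filter

def points_of_interest(audio, fs, db_range=20.0):
    # Short-time Fourier transform -> spectrogram in decibels.
    f, t, Sxx = spectrogram(audio, fs=fs, nperseg=512)
    S_db = 10 * np.log10(Sxx + 1e-12)
    # Keep strong local maxima as candidate points of interest.
    peaks = (maximum_filter(S_db, size=5) == S_db) & (S_db > S_db.max() - db_range)
    rows, cols = np.nonzero(peaks)
    return [(f[r], t[c]) for r, c in zip(rows, cols)]

fs = 22050
t = np.linspace(0, 1, fs, endpoint=False)
chirp = np.sin(2 * np.pi * (2000 + 1500 * t) * t)  # synthetic rising call
print(len(points_of_interest(chirp, fs)))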

Relevance: 30.00%

Abstract:

Modelling fluvial processes is an effective way to reproduce basin evolution and to recreate riverbed morphology. However, due to the complexity of alluvial environments, deterministic modelling of fluvial processes is often impossible. To address the related uncertainties, we derive a stochastic fluvial process model on the basis of the convective Exner equation that takes the statistics (mean and variance) of river velocity as input parameters. These statistics allow the uncertainty in riverbed topography, river discharge, and the position of the river channel to be quantified. To couple the velocity statistics with the fluvial process model, the perturbation method is employed with a non-stationary spectral approach to decompose the Exner equation into two separate equations: a mean equation, which yields the mean sediment thickness, and a perturbation equation, which yields the variance of sediment thickness. The resulting solutions offer an effective tool to characterize alluvial aquifers formed by fluvial processes, while incorporating the stochasticity of the paleoflow velocity.
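
For orientation, a commonly cited form of the Exner equation is shown below; the paper's convective variant and notation may differ. Decomposing the velocity into a mean and a perturbation is what yields the separate mean and variance equations described in the abstract.

\[
(1 - \lambda_p)\,\frac{\partial \eta}{\partial t} + \frac{\partial q_s(u)}{\partial x} = 0,
\qquad u = \bar{u} + u',
\]

where $\eta$ is the bed elevation (sediment thickness), $\lambda_p$ the bed porosity, $q_s$ the sediment flux driven by the flow velocity $u$, $\bar{u}$ the mean velocity, and $u'$ its perturbation. Substituting the decomposition and averaging gives an equation for the mean of $\eta$; the residual perturbation terms give an equation for its variance.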

Relevance: 30.00%

Abstract:

Acoustic recordings of the environment provide an effective means to monitor bird species diversity. To facilitate the exploration of acoustic recordings, we describe a content-based birdcall retrieval algorithm. A query birdcall is a region of the spectrogram bounded in frequency and time. Retrieval depends on a similarity measure derived from the orientation and distribution of spectral ridges. The spectral ridge detection method caters for a broad range of birdcall structures. In this paper, we extend previous work by incorporating a spectrogram scaling step to improve the detection of spectral ridges. Compared to an existing approach based on MFCC features, our feature representation achieves better retrieval performance for multiple bird species in noisy recordings.
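
To make the similarity measure concrete, here is a minimal sketch of the orientation component only: histograms of ridge directions compared by cosine similarity. The bin choices and the omission of the ridges' spatial distribution are simplifications of the paper's measure.

import numpy as np

def ridge_orientation_histogram(angles, bins=4):
    # Histogram over ridge directions (e.g. horizontal, two diagonals,
    # vertical), normalised to unit length.
    hist, _ = np.histogram(angles, bins=bins, range=(0.0, np.pi))
    hist = hist.astype(float)
    return hist / (np.linalg.norm(hist) + 1e-12)

def similarity(query_angles, candidate_angles):
    # Cosine similarity between the two orientation histograms.
    return float(np.dot(ridge_orientation_histogram(query_angles),
                        ridge_orientation_histogram(candidate_angles)))

print(similarity([0.1, 0.2, 1.5], [0.15, 0.3, 1.4]))  # near 1: similar calls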

Relevance: 30.00%

Abstract:

User-generated information such as online reviews has become increasingly significant for customers in the decision-making process. Meanwhile, as the volume of online reviews proliferates, there is a pressing demand to help users tackle the information overload problem. To extract useful information from overwhelming numbers of reviews, considerable work has been proposed, such as review summarization and review selection. In particular, to avoid redundant information, researchers attempt to select a small set of reviews to represent the entire review corpus by preserving its statistical properties (e.g., opinion distribution). However, one significant drawback of the existing work is that it measures the utility of the extracted reviews only as a whole, without considering the quality of each individual review. As a result, the set of chosen reviews may contain low-quality ones even if its statistical properties are close to those of the original review corpus, which is not what users prefer. In this paper, we propose a review selection method that takes review quality into consideration during the selection process. Specifically, we examine the relationships between product features based upon a domain ontology to capture review characteristics, and on that basis select reviews that are of good quality and also preserve the opinion distribution. Our experimental results on real-world review datasets demonstrate that the proposed approach is feasible and effectively improves review selection performance.
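
A greedy sketch of quality-aware selection: each step adds the review that best balances individual quality against keeping the selection's opinion distribution close to the corpus-wide one. The field names, weights, and L1 distance are illustrative assumptions, not the paper's ontology-based characterization.

import numpy as np

def opinion_distribution(reviews, classes=3):
    # Distribution over opinion labels (e.g. negative/neutral/positive).
    counts = np.bincount([r["opinion"] for r in reviews], minlength=classes)
    return counts / max(counts.sum(), 1)

def greedy_select(reviews, k, w=0.5):
    target = opinion_distribution(reviews)
    selected, remaining = [], list(reviews)
    for _ in range(k):
        def score(r):
            drift = np.abs(opinion_distribution(selected + [r]) - target).sum()
            return w * r["quality"] - (1 - w) * drift
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

reviews = [{"opinion": o, "quality": q}
           for o, q in [(0, 0.9), (2, 0.4), (2, 0.8), (1, 0.7), (0, 0.2)]]
print(greedy_select(reviews, k=3))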

Relevance: 30.00%

Abstract:

The concept of energy gap(s) is useful for understanding the consequences of a small daily, weekly, or monthly positive energy balance and the inconspicuous weight gain that ultimately leads to overweight and obesity. The energy gap is a dynamic concept: an initial positive energy gap, incurred via an increase in energy intake (or a decrease in physical activity), is not constant, may fade out with time if the initial conditions are maintained, and depends on the 'efficiency' with which the energy imbalance is readjusted over time. The metabolic response to an energy imbalance gap and the magnitude of the energy gap(s) can be estimated by at least two methods: i) longitudinal overfeeding studies, which impose (by design) an initial positive energy imbalance gap; and ii) retrospective assessment based on epidemiological surveys, whereby the accumulated endogenous energy storage per unit of time is calculated from the changes in body weight and body composition. To illustrate the difficulty of accurately assessing an energy gap, we use, as an example, a recent epidemiological study that tracked changes in total energy intake (estimated from gross food availability) and body weight over three decades in the US, combined with total energy expenditure predicted from body weight using doubly labelled water data. At the population level, the study attempted to attribute the energy gap entirely to increased food intake. Based on an estimate of the change in energy intake judged to be more reliable (i.e., from the same study population), together with calculations of simple energetic indices, our analysis suggests that conclusions about the fundamental causes of obesity development in a population (excess intake vs. low physical activity, or both) are clouded by a high level of uncertainty.
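
Method ii), the retrospective assessment, reduces to simple arithmetic: accumulated body energy storage divided by the time over which it accrued. The energy densities below (~9400 kcal/kg for fat mass, ~1100 kcal/kg for fat-free mass) are commonly used literature approximations, and the weight changes are invented for illustration.

def energy_gap_kcal_per_day(d_fat_kg, d_ffm_kg, days,
                            fat_kcal_kg=9400.0, ffm_kcal_kg=1100.0):
    # Accumulated endogenous energy storage per unit of time.
    return (d_fat_kg * fat_kcal_kg + d_ffm_kg * ffm_kcal_kg) / days

# e.g. 5 kg of fat and 1 kg of lean tissue gained over one decade
print(energy_gap_kcal_per_day(5.0, 1.0, days=3650))  # ~13 kcal/day

The abstract's point follows directly: a persistent imbalance of barely a dozen kilocalories per day is far smaller than the uncertainty of any population-level intake or expenditure estimate.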