Abstract:
In [8], the authors developed a logical system based on the definition of a new non-classical connective ⊗ capturing the notion of reparative obligation. The system proved appropriate for handling well-known contrary-to-duty paradoxes, but no model-theoretic semantics was presented. In this paper we fill this gap and define a suitable possible-world semantics for the system, for which we prove soundness and completeness. The semantics is a preference-based, non-normal one that extends and generalizes semantics for classical modal logics.
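To give a rough feel for what such a connective expresses (the concrete syntax and reading in [8] may differ, so this is an illustrative assumption), a contrary-to-duty scenario such as "one ought not to cause damage; if damage is nevertheless caused, one ought to compensate" can be written as a reparative chain:

```latex
% Hedged illustration; the exact syntax in [8] may differ.
% Read a \otimes b as: a is obligatory, but if a is violated,
% then b becomes obligatory as a reparation.
\neg\mathit{damage} \otimes \mathit{compensate}
% Longer chains handle nested violations:
% \neg\mathit{damage} \otimes \mathit{compensate} \otimes \mathit{apologise}
```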
Abstract:
Previous qualitative research has highlighted that temporality plays an important role in relevance for clinical records search. In this study, an investigation is undertaken to determine the effect that the timespan of events within a patient record has on relevance in a retrieval scenario. In addition, based on the standard practice of document length normalisation, a document timespan normalisation model that specifically accounts for timespans is proposed. Initial analysis revealed that, in general, relevant patient records tended to cover a longer timespan of events than non-relevant patient records. However, an empirical evaluation using the TREC Medical Records track supports the opposite view: that shorter documents (in terms of timespan) are better for retrieval. These findings highlight that the role of temporality in relevance is complex, and how to effectively deal with temporality within a retrieval scenario remains an open question.
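By analogy with pivoted document length normalisation, a timespan normalisation component could take a form like the following sketch; the pivoted shape, the `slope` parameter, and the function names are assumptions for illustration, not the paper's actual model.

```python
from datetime import date

def record_timespan_days(event_dates: list[date]) -> int:
    """Timespan of a patient record: days between its earliest and latest event."""
    return (max(event_dates) - min(event_dates)).days

def timespan_normalised_score(raw_score: float, timespan_days: float,
                              avg_timespan_days: float, slope: float = 0.75) -> float:
    """Pivoted-style normalisation (assumed form): penalise records whose event
    timespan exceeds the collection average, mirroring length normalisation."""
    norm = 1.0 - slope + slope * (timespan_days / avg_timespan_days)
    return raw_score / norm
```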
Abstract:
Non-monotonic reasoning typically deals with three kinds of knowledge. Facts are meant to describe immutable statements about the environment. Rules define relationships among elements. Lastly, an ordering among the rules, in the form of a superiority relation, establishes the relative strength of rules. To revise a non-monotonic theory, we can change any one of these three elements. We prove that the problem of revising a non-monotonic theory by changing only the superiority relation is NP-complete.
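A minimal sketch of the three kinds of knowledge as data structures (the encoding and rule names are illustrative, not the paper's notation); revision restricted to the superiority relation edits only the last of the three:

```python
# Illustrative encoding of a defeasible theory; notation is ours, not the paper's.
facts = {"bird", "penguin"}                     # immutable statements
rules = {
    "r1": ({"bird"}, "flies"),                  # r1: bird => flies
    "r2": ({"penguin"}, "not_flies"),           # r2: penguin => not_flies
}
superiority = {("r2", "r1")}                    # r2 defeats r1 when both fire

# Revising by superiority alone: facts and rules stay fixed; only pairs in
# `superiority` may be added or removed. Deciding whether such a change can
# produce a desired outcome is the problem shown to be NP-complete.
revised = (superiority - {("r2", "r1")}) | {("r1", "r2")}
```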
Abstract:
Concept mapping involves determining relevant concepts from a free-text input, where concepts are defined in an external reference ontology. This is an important process that underpins many applications, including clinical information reporting, derivation of phenotypic descriptions, and a number of state-of-the-art medical information retrieval methods. Concept mapping can be cast as an information retrieval (IR) problem: free-text mentions are treated as queries and concepts from a reference ontology as the documents to be indexed and retrieved. This paper presents an empirical investigation applying general-purpose IR techniques for concept mapping in the medical domain. A dataset used for evaluating medical information extraction is adapted to measure the effectiveness of the considered IR approaches. The standard IR approaches used here are contrasted with the effectiveness of two established benchmark methods specifically developed for medical concept mapping. The empirical findings show that the IR approaches are comparable with one benchmark method but well below the best benchmark.
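Cast as IR, this amounts to indexing concept labels as documents and issuing each free-text mention as a query. A minimal sketch with TF-IDF and cosine similarity follows; the concept identifiers, labels, and library choice are illustrative, not taken from the paper.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Concept "documents": labels from a reference ontology (toy examples).
concepts = {
    "C0027051": "myocardial infarction",
    "C0020538": "hypertensive disease",
    "C0011849": "diabetes mellitus",
}
ids, labels = zip(*concepts.items())

vectoriser = TfidfVectorizer()
index = vectoriser.fit_transform(labels)          # index concepts as documents

def map_mention(mention: str, k: int = 3):
    """Treat a free-text mention as a query; rank concepts by similarity."""
    scores = cosine_similarity(vectoriser.transform([mention]), index)[0]
    return sorted(zip(ids, scores), key=lambda p: -p[1])[:k]

print(map_mention("diabetes"))
# Note: a purely lexical match fails on synonyms such as "heart attack" vs.
# "myocardial infarction", one reason general-purpose IR can trail specialised
# concept-mapping benchmarks.
```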
Abstract:
The recent trend for journals to require open access to primary data included in publications has been embraced by many biologists, but has caused apprehension amongst researchers engaged in long-term ecological and evolutionary studies. A worldwide survey of 73 principal investigators (PIs) with long-term studies revealed positive attitudes towards sharing data with the agreement or involvement of the PI, and 93% of PIs have historically shared data. Only 8% were in favor of uncontrolled, open access to primary data, while 63% expressed serious concern. We present here their viewpoint on an issue that can have non-trivial scientific consequences. We discuss potential costs of public data archiving and provide possible solutions to meet the needs of journals and researchers.
Abstract:
Permissions are a special case of deontic effects and play an important role in compliance. Essentially, they are used to defeat obligations or prohibitions to the contrary. A formal language (e.g., temporal logic, event calculus, etc.) that is unable to represent permissions is doomed to be unable to represent most real-life legal norms. In this paper we address this issue and extend the deontic event calculus (DEC) with new predicates for modelling permissions, enabling it to elegantly capture the intuition behind real-life cases of permissions.
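The core intuition, that an explicit permission blocks the derivation of a contrary obligation or prohibition, can be sketched as follows; the predicate names and the Python encoding are illustrative stand-ins, not the paper's actual DEC extension.

```python
# Illustrative only: stand-in predicates, not the paper's DEC extension.
obligations = {("pay_tax", "t1")}        # Obl(action, time)
prohibitions = {("park_here", "t1")}     # Forb(action, time)
permissions = {("park_here", "t1")}      # Perm(action, time), e.g. a special permit

def holds_forbidden(action: str, time: str) -> bool:
    """A prohibition holds unless an explicit contrary permission defeats it."""
    return (action, time) in prohibitions and (action, time) not in permissions

print(holds_forbidden("park_here", "t1"))  # False: the permission prevails
```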
Abstract:
This study investigates the use of unsupervised features derived from word embedding approaches and novel sequence representation approaches for improving clinical information extraction systems. Our results corroborate previous findings indicating that the use of word embeddings significantly improves the effectiveness of concept extraction models; however, we further determine the influence that the corpora used to generate such features have. We also demonstrate the promise of sequence-based unsupervised features for further improving concept extraction.
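One common way to plug such unsupervised features into a concept extraction model is to append each token's embedding dimensions to the usual lexical features of a sequence labeller. A sketch using gensim follows; the corpus, dimensions, and feature names are placeholders, and the corpus choice is exactly what the study shows to matter.

```python
from gensim.models import Word2Vec

# Placeholder corpus; in practice a large clinical corpus would be used.
sentences = [["patient", "denies", "chest", "pain"],
             ["chest", "pain", "radiating", "to", "left", "arm"]]
w2v = Word2Vec(sentences, vector_size=50, window=5, min_count=1)

def token_features(token: str) -> dict:
    """Lexical features plus unsupervised embedding features for one token."""
    feats = {"lower": token.lower(), "is_title": token.istitle()}
    if token in w2v.wv:
        for i, v in enumerate(w2v.wv[token]):
            feats[f"emb_{i}"] = float(v)   # dense features for a CRF/linear model
    return feats
```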
Abstract:
Recent advances in neural language models have contributed new methods for learning distributed vector representations of words (also called word embeddings). Two such methods are the continuous bag-of-words model and the skip-gram model. These methods have been shown to produce embeddings that capture higher-order relationships between words and that are highly effective in natural language processing tasks involving word similarity and word analogy. Despite these promising results, there has been little analysis of the use of these word embeddings for retrieval. Motivated by these observations, in this paper we set out to determine how these word embeddings can be used within a retrieval model and what the benefit might be. To this end, we use neural word embeddings within the well-known translation language model for information retrieval. This language model captures implicit semantic relations between the words in queries and those in relevant documents, thus producing more accurate estimations of document relevance. The word embeddings used to estimate neural language models produce translations that differ from previous translation language model approaches, and these differences deliver improvements in retrieval effectiveness. The models are robust to choices made in building word embeddings; even more so, our results show that embeddings do not even need to be produced from the same corpus being used for retrieval.
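In a translation language model, the likelihood of a query term mixes in translation probabilities between document words and query words; deriving those probabilities from embedding similarities can be sketched as below. The smoothing choice, the shift-and-normalise step, and the parameter names are assumptions, one of several plausible formulations rather than the paper's exact estimator.

```python
import numpy as np

def translation_prob(q_vec: np.ndarray, doc_vecs: np.ndarray) -> np.ndarray:
    """p_t(q|w) for each document word w, from cosine similarity of embeddings
    (similarities clipped to be non-negative, then normalised to sum to 1)."""
    sims = doc_vecs @ q_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec) + 1e-9)
    sims = np.clip(sims, 0.0, None)
    return sims / (sims.sum() + 1e-9)

def score_query_term(q_vec, doc_vecs, doc_term_probs, coll_prob, lam=0.5):
    """Jelinek-Mercer smoothed translation LM score for one query term:
    p(q|d) = (1 - lam) * sum_w p_t(q|w) p(w|d) + lam * p(q|C)."""
    p_trans = float(translation_prob(q_vec, doc_vecs) @ doc_term_probs)
    return (1 - lam) * p_trans + lam * coll_prob
```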
Abstract:
Objective: Death certificates provide an invaluable source for cancer mortality statistics; however, this value can only be realised if accurate, quantitative data can be extracted from certificates, an aim hampered by both the volume and the variable nature of certificates written in natural language. This paper proposes an automatic classification system for identifying cancer-related causes of death from death certificates. Methods: Detailed features, including terms, n-grams and SNOMED CT concepts, were extracted from a collection of 447,336 death certificates. These features were used to train Support Vector Machine classifiers (one classifier for each cancer type). The classifiers were deployed in a cascaded architecture: the first level identified the presence of cancer (i.e., binary cancer/no cancer) and the second level identified the type of cancer (according to the ICD-10 classification system). A held-out test set was used to evaluate the effectiveness of the classifiers according to precision, recall and F-measure. In addition, detailed feature analysis was performed to reveal the characteristics of a successful cancer classification model. Results: The system was highly effective at identifying cancer as the underlying cause of death (F-measure 0.94). The system was also effective at determining the type of cancer for common cancers (F-measure 0.7). Rare cancers, for which there was little training data, were difficult to classify accurately (F-measure 0.12). Factors influencing performance were the amount of training data and certain ambiguous cancers (e.g., those in the stomach region). The feature analysis revealed that a combination of features was important for cancer type classification, with SNOMED CT concept and oncology-specific morphology features proving the most valuable. Conclusion: The system proposed in this study provides automatic identification and characterisation of cancers from large collections of free-text death certificates. This allows organisations such as Cancer Registries to monitor and report on cancer mortality in a timely and accurate manner. In addition, the methods and findings are generally applicable beyond cancer classification and to other sources of medical text besides death certificates.
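The cascaded architecture can be sketched with two off-the-shelf classifiers: a binary first stage, and a multi-class second stage trained only on cancer certificates. Feature extraction is simplified to bag-of-words here (the paper's features also include n-grams and SNOMED CT concepts), and the toy data and labels are illustrative.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data: death certificate texts with ICD-10 labels ("" = non-cancer).
texts  = ["metastatic lung carcinoma", "acute myocardial infarction",
          "gastric adenocarcinoma", "pneumonia"]
labels = ["C34", "", "C16", ""]

# Stage 1: binary cancer / no cancer.
is_cancer = [bool(l) for l in labels]
stage1 = make_pipeline(CountVectorizer(), LinearSVC()).fit(texts, is_cancer)

# Stage 2: cancer type, trained only on the cancer certificates.
cancer_texts  = [t for t, c in zip(texts, is_cancer) if c]
cancer_labels = [l for l in labels if l]
stage2 = make_pipeline(CountVectorizer(), LinearSVC()).fit(cancer_texts, cancer_labels)

def classify(text: str) -> str:
    """Cascade: stage 1 detects cancer; stage 2 assigns the ICD-10 type."""
    if not stage1.predict([text])[0]:
        return "non-cancer"
    return stage2.predict([text])[0]
```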
Abstract:
This paper investigates the effect that text pre-processing approaches have on the estimation of the readability of web pages. Readability has been highlighted as an important aspect of web search result personalisation in previous work. The most widely used text readability measures rely on surface-level characteristics of text, such as the length of words and sentences. We demonstrate that different tools for extracting text from web pages lead to very different estimations of readability. This has an important implication for search engines, because search result personalisation strategies that consider users' reading ability may fail if incorrect text readability estimations are computed.
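Surface-level measures of the kind referred to combine average sentence length with word length or syllable counts. The widely used Flesch Reading Ease score, for instance, can be sketched with a naive syllable counter (the counter is a rough heuristic, and whatever text extraction puts into `text` upstream drives the sensitivity the paper reports):

```python
import re

def count_syllables(word: str) -> int:
    """Very rough heuristic: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease from surface statistics of the extracted text."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    n = max(1, len(words))
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (n / sentences) - 84.6 * (syllables / n)
```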
Abstract:
Theories of search and search behavior can be used to glean insights and generate hypotheses about how people interact with retrieval systems. This paper examines three such theories: the long-standing Information Foraging Theory, along with the more recently proposed Search Economic Theory and the Interactive Probability Ranking Principle. Our goal is to develop a model for ad-hoc topic retrieval using each approach, all within a common framework, in order to (1) determine what predictions each approach makes about search behavior, and (2) show the relationships, equivalences and differences between the approaches. While each approach takes a different perspective on modeling searcher interactions, we show that, under certain assumptions, they lead to similar hypotheses regarding search behavior. Moreover, we show that the models are complementary to each other but operate at different levels (i.e., sessions, patches and situations). We further show how the differences between the approaches lead to new insights into the theories and to new models. This contribution will not only lead to further theoretical developments, but will also enable practitioners to employ one of the three equivalent models depending on the data available.
Abstract:
In our recent paper [1], we discussed some potential undesirable consequences of public data archiving (PDA) with specific reference to long-term studies and proposed solutions to manage these issues. We reaffirm our commitment to data sharing and collaboration, both of which have been common and fruitful practices supported for many decades by researchers involved in long-term studies. We acknowledge the potential benefits of PDA (e.g., [2]), but believe that several potential negative consequences for science have been underestimated [1] (see also [3] and [4]). The objective of our recent paper [1] was to define practices that simultaneously maximize the benefits and minimize the potential unwanted consequences of PDA.
Abstract:
The androgen receptor (AR) is the main therapeutic target for advanced prostate cancer (PCa). Current treatments have focused on inhibiting the transcriptional activity of the AR; however, androgens can also induce non-genomic effects by facilitating the initiation of kinase signaling cascades in PCa. Cells, including PCa cells, secrete extracellular vesicles (EVs), which are able to mediate communication between cells and can also contribute to these processes.