163 results for Guido, Tomás
Abstract:
Many existing information retrieval models do not explicitly take into account information about word associations. Our approach makes use of first and second order relationships found in natural language, known as syntagmatic and paradigmatic associations, respectively. This is achieved by using a formal model of word meaning within the query expansion process. On ad hoc retrieval, our approach achieves statistically significant improvements in MAP (0.158) and P@20 (0.396) over our baseline model. The ERR@20 and nDCG@20 of our system were 0.249 and 0.192, respectively. Our results and discussion suggest that information about both syntagmatic and paradigmatic associations can assist with improving retrieval effectiveness on ad hoc retrieval.
Abstract:
This paper presents a graph-based method to weight medical concepts in documents for the purposes of information retrieval. Medical concepts are extracted from free-text documents using a state-of-the-art technique that maps n-grams to concepts from the SNOMED CT medical ontology. In our graph-based concept representation, concepts are vertices in a graph built from a document, and edges represent associations between concepts. This representation naturally captures dependencies between concepts, an important requirement for interpreting medical text and a feature lacking in bag-of-words representations. We apply existing graph-based term weighting methods to weight medical concepts. Using concepts rather than terms addresses vocabulary mismatch and encapsulates terms belonging to a single medical entity into a single concept. In addition, we extend previous graph-based approaches by injecting domain knowledge that estimates the importance of a concept within the global medical domain. Retrieval experiments on the TREC Medical Records collection show that our method outperforms both term and concept baselines. More generally, this work provides a means of integrating background knowledge contained in medical ontologies into data-driven information retrieval approaches.
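The graph-based weighting idea can be illustrated with a minimal sketch. The abstract does not say which graph-based weighting method or association rule is used, so everything here is an assumption: edges link concepts that occur within a sliding window, weights come from a plain PageRank-style iteration, and the concept identifiers are hypothetical placeholders rather than real SNOMED CT codes. The domain-knowledge extension is not modelled.

```python
from collections import defaultdict

def concept_graph(concepts, window=2):
    """Build an undirected graph linking concepts that occur within
    `window` positions of each other (assumed association rule)."""
    graph = defaultdict(set)
    for i, concept in enumerate(concepts):
        for other in concepts[i + 1 : i + 1 + window]:
            if other != concept:
                graph[concept].add(other)
                graph[other].add(concept)
    return graph

def pagerank_weights(graph, damping=0.85, iterations=50):
    """Weight each vertex with a basic PageRank iteration (assumed method)."""
    nodes = list(graph)
    n = len(nodes)
    weights = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        updated = {}
        for node in nodes:
            # Each neighbour shares its weight equally among its own edges.
            incoming = sum(weights[nb] / len(graph[nb]) for nb in graph[node])
            updated[node] = (1 - damping) / n + damping * incoming
        weights = updated
    return weights

# Hypothetical concept sequence extracted from one document.
doc = ["C_fracture", "C_femur", "C_surgery", "C_femur", "C_pain"]
weights = pagerank_weights(concept_graph(doc))
```

In this toy document the better-connected concepts (`C_femur`, `C_surgery`) receive higher weights than the peripheral ones, which is the behaviour a bag-of-words count would not capture.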
Abstract:
Persistent, lipophilic organochlorine pesticides (OCPs) such as dichlorodiphenyltrichloroethane (DDT), hexachlorocyclohexanes (HCHs), dieldrin, chlordanes, hexachlorobenzene (HCB) and mirex are known to accumulate in human samples [1, 2]. Persistent OCPs are among the chemicals that are covered under the Stockholm Convention on persistent organic pollutants [3]. Exceptions to this include relatively less lipophilic compounds like HCH (KOW < 10^5). In Australia, OCPs such as DDT and HCHs were introduced in the 1940s. This was followed by a period of widespread use until the 1970s, when recognition of risks related to OCPs resulted in reduced use and their ultimate ban in the 1980s. Mirex, however, remained in very restricted use in Northern Australia for treatment of one species of termite, the giant termite (Mastotermes darwiniensis), but this use was phased out in 2007.
Abstract:
Population-representative data for dioxin and PCB congener concentrations are available for the Australian population based on measurements in age- and gender-specific serum pools [1]. Such data provide a basis for characterizing the mean concentrations of these compounds in the population, but do not provide information on the inter-individual variation in serum concentrations that may exist in the population within an age- and gender-specific group. Such variation may occur due to inter-individual differences in long-term exposure levels or elimination rates. Reference values are estimates of upper percentiles (often the 95th percentile) of measured values in a defined population that can be used to evaluate data from individuals in the population in order to identify concentrations that are elevated, for example, from occupational exposures [2]. The objective of this analysis is to estimate reference values corresponding to the 95th percentile (RV95s) for Australia on an age-specific basis for individual dioxin-like congeners based on measurements in serum pools from Toms and Mueller (2010).
Abstract:
From human biomonitoring data that are increasingly collected in the United States, Australia, and other countries in large-scale field studies, we obtain snapshots of concentration levels of various persistent organic pollutants (POPs) within a cross section of the population at different times. Not only can we observe the trends within this population over time, but we can also gain information going beyond the obvious time trends. By combining the biomonitoring data with pharmacokinetic modeling, we can reconstruct the time-variant exposure to individual POPs, determine their intrinsic elimination half-lives in the human body, and predict future levels of POPs in the population. Different approaches have been employed to extract information from human biomonitoring data. Pharmacokinetic (PK) models have been combined with longitudinal data [1], with single [2] or multiple [3] average concentrations of cross-sectional data (CSD), or with multiple CSD with or without empirical exposure data [4]. In the latter study, for the first time, the authors based their modeling outputs on two sets of CSD and empirical exposure data, so that the extensive body of empirical measurements further constrained their model outputs. Here we use a PK model to analyze recent levels of PBDE concentrations measured in the Australian population. In this study, we are able to base our model results on four sets [5-7] of CSD; we focus on two PBDE congeners that have been shown [3, 5, 8-9] to differ in intake rates and half-lives, with BDE-47 being associated with high intake rates and a short half-life and BDE-153 with lower intake rates and a longer half-life. By fitting the model to PBDE levels measured in different age groups in different years, we determine the level of intake of BDE-47 and BDE-153, as well as the half-lives of these two chemicals in the Australian population.
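To show how intake and half-life jointly determine body concentrations, here is a minimal sketch of a one-compartment model with first-order elimination and constant intake. The abstract does not describe the structure of the authors' PK model, so this is a generic textbook form and not their method. All parameter names and numeric values are hypothetical illustrations.

```python
import math

def elimination_rate(half_life_years):
    """First-order elimination rate constant k = ln(2) / half-life."""
    return math.log(2) / half_life_years

def steady_state(intake_ng_per_year, lipid_mass_g, half_life_years):
    """Steady-state concentration (ng/g lipid) under constant intake."""
    return intake_ng_per_year / (elimination_rate(half_life_years) * lipid_mass_g)

def concentration(t_years, intake_ng_per_year, lipid_mass_g, half_life_years, c0=0.0):
    """Analytic solution of dC/dt = intake / lipid_mass - k * C with C(0) = c0."""
    k = elimination_rate(half_life_years)
    c_ss = steady_state(intake_ng_per_year, lipid_mass_g, half_life_years)
    return c_ss + (c0 - c_ss) * math.exp(-k * t_years)

# Hypothetical values: same intake, different half-lives.
short_lived = steady_state(1000.0, 20000.0, half_life_years=2.0)
long_lived = steady_state(1000.0, 20000.0, half_life_years=8.0)
```

With equal intake, the compound with the longer half-life settles at a higher steady-state concentration. This is why cross-sectional concentrations alone cannot separate intake from half-life, and why fitting across several age groups and sampling years helps constrain both.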
Abstract:
Measures of semantic similarity between medical concepts are central to a number of techniques in medical informatics, including query expansion in medical information retrieval. Previous work has mainly considered thesaurus-based path measures of semantic similarity and has not compared different corpus-driven approaches in depth. We evaluate the effectiveness of eight common corpus-driven measures in capturing semantic relatedness and compare these against human-judged concept pairs assessed by medical professionals. Our results show that certain corpus-driven measures correlate strongly (approximately 0.8) with human judgements. An important finding is that performance was significantly affected by the choice of corpus used in priming the measure, i.e., used as evidence from which corpus-driven similarities are drawn. This paper provides guidelines for the implementation of semantic similarity measures for medical informatics and concludes with implications for medical information retrieval.
Abstract:
Norms regulate the behaviour of their subjects and define what is legal and what is illegal. Norms typically describe the conditions under which they are applicable and the normative effects that result from their application. Process models, on the other hand, specify how a business operation or service is to be carried out to achieve a desired outcome. Norms can have significant impact on how business operations are conducted, and they can apply to the whole or part of a business process. For example, they may impose conditions on the different aspects of a process, such as performing tasks in a specific sequence (control-flow), at a specific time or within a certain time frame (temporal aspect), or by specific people (resources). We propose a framework that provides the formal semantics of the normative requirements for determining whether a business process complies with a normative document (where a normative document can be understood in a very broad sense, ranging from internal policies to best practice policies to statutory acts). We also present a classification of normative requirements based on the notion of different types of obligations and the effects of violating these obligations.
Abstract:
Existing compliance management frameworks (CMFs) offer a multitude of compliance management capabilities, which makes it difficult for enterprises to decide on the suitability of a framework. Making a decision on suitability requires a deep understanding of the functionalities of a framework. Gaining such an understanding is a difficult task which, in turn, requires specialised tools and methodologies for evaluation. Current compliance research lacks such tools and methodologies for evaluating CMFs. This paper reports a methodological evaluation of existing CMFs based on pre-defined evaluation criteria. Our evaluation highlights what existing CMFs offer and what they do not. It also underlines various open questions and discusses the challenges in this direction.
Abstract:
A user’s query is considered to be an imprecise description of their information need. Automatic query expansion is the process of reformulating the original query with the goal of improving retrieval effectiveness. Many successful query expansion techniques ignore information about the dependencies that exist between words in natural language. However, more recent approaches have demonstrated that, by explicitly modeling associations between terms, significant improvements in retrieval effectiveness can be achieved over approaches that ignore these dependencies. State-of-the-art dependency-based approaches have been shown to primarily model syntagmatic associations. Syntagmatic associations infer a likelihood that two terms co-occur more often than by chance. However, structural linguistics relies on both syntagmatic and paradigmatic associations to deduce the meaning of a word. Given the success of dependency-based approaches and the reliance on word meanings in the query formulation process, we argue that modeling both syntagmatic and paradigmatic information in the query expansion process will improve retrieval effectiveness. This article develops and evaluates a new query expansion technique that is based on a formal, corpus-based model of word meaning that models syntagmatic and paradigmatic associations. We demonstrate that when sufficient statistical information exists, as in the case of longer queries, including paradigmatic information alone provides significant improvements in retrieval effectiveness across a wide variety of data sets. More generally, when our new query expansion approach is applied to large-scale web retrieval, it demonstrates significant improvements in retrieval effectiveness over a strong baseline system based on a commercial search engine.
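The distinction between the two association types can be made concrete with a toy sketch. It is not the formal model of word meaning the article develops. It assumes, purely for illustration, that syntagmatic strength is a raw co-occurrence count within a one-word window and that paradigmatic strength is the cosine similarity between two words' context vectors. The corpus is invented.

```python
import math
from collections import Counter, defaultdict

# Invented toy corpus.
corpus = [
    "the doctor treated the patient",
    "the nurse treated the patient",
    "the doctor examined the patient",
    "the nurse examined the patient",
]

def context_vectors(sentences, window=1):
    """Count, for every word, the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

vectors = context_vectors(corpus)
syntagmatic = vectors["doctor"]["treated"]                   # words that occur together
paradigmatic = cosine(vectors["doctor"], vectors["nurse"])   # words that share contexts
```

Here "doctor" and "nurse" never co-occur, so they have no syntagmatic association, yet they appear in identical contexts and so are strongly paradigmatically associated. A co-occurrence-only expansion method would miss this substitutability.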
Abstract:
Many successful query expansion techniques ignore information about the term dependencies that exist within natural language. However, researchers have recently demonstrated that consistent and significant improvements in retrieval effectiveness can be achieved by explicitly modelling term dependencies within the query expansion process. This has created an increased interest in dependency-based models. State-of-the-art dependency-based approaches primarily model term associations known within structural linguistics as syntagmatic associations, which are formed when terms co-occur more often than by chance. However, structural linguistics proposes that the meaning of a word is also dependent on its paradigmatic associations, which are formed between words that can substitute for each other without affecting the acceptability of a sentence. Given the reliance on word meanings when a user formulates their query, our approach takes the novel step of modelling both syntagmatic and paradigmatic associations within the query expansion process based on the (pseudo) relevant documents returned in web search. The results demonstrate that this approach can provide significant improvements in web retrieval effectiveness when compared to a strong benchmark retrieval system.
Abstract:
The article focuses on how the information seeker makes decisions about relevance. It will employ a novel decision theory based on quantum probabilities. This direction derives from mounting research within the field of cognitive science showing that decision theory based on quantum probabilities is superior to standard probability models for modelling human judgements [2, 1]. By quantum probabilities, we mean that the decision event space is modelled as a vector space rather than the usual Boolean algebra of sets. In this way, incompatible perspectives around a decision can be modelled, leading to an interference term which modifies the law of total probability. The interference term is crucial in modifying the probability judgements made by current probabilistic systems so that they align better with human judgement. The goal of this article is thus to model the information seeker as a decision maker. For this purpose, signal detection models will be sketched which are in principle applicable in a wide variety of information seeking scenarios.
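The modification of the law of total probability can be written out as follows. This is the form commonly given in the quantum cognition literature, not a formula taken from the article. The symbols are assumptions introduced here: $A$ is the judged event (for example, "document is relevant"), $B$ and $\bar{B}$ are two incompatible perspectives, and $\theta$ is the phase angle between them.

```latex
% Classical law of total probability
P(A) = P(B)\,P(A \mid B) + P(\bar{B})\,P(A \mid \bar{B})

% Quantum version: the same two terms plus an interference term
P_q(A) = P(B)\,P(A \mid B) + P(\bar{B})\,P(A \mid \bar{B})
       + 2\sqrt{P(B)\,P(A \mid B)\,P(\bar{B})\,P(A \mid \bar{B})}\,\cos\theta
```

When $\cos\theta = 0$ the interference term vanishes and the classical law is recovered. A non-zero value raises or lowers the predicted probability, which is how the model can account for human judgements that deviate from classical predictions.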
Abstract:
Time plays an important role in norms. In this paper we start from our previously proposed classification of obligations and point out some shortcomings of Event Calculus (EC) in representing obligations. We propose an extension of EC that avoids these shortcomings, and we show how to use it to model the various types of obligations.
Abstract:
Bisphenol A (BPA or 4,4’-(propane-2,2-diyl)diphenol) is a chemical intermediate in the production of polycarbonate and epoxy resins, and is used in a wide range of applications. BPA has attracted significant attention in the past decade due to its frequency of detection in human populations worldwide, demonstrated animal toxicity and potential impact on human health, particularly during critical periods of development. The aim of this study was to perform a preliminary assessment of age-related trends in urinary concentration and to estimate daily excretion of BPA in Australian children (aged >0 – <5 years) and adults (≥15 – <75 years). This was achieved using 79 samples pooled by age and gender, created from 868 individual samples of convenience collected as part of routine, community-based pathology testing. Total BPA was analyzed using online-SPE-LC-MS/MS and detected in all samples with a range of 0.65 – 265 ng/ml. No significant differences were observed between males and females. A urine flow model was constructed from published values and used to provide an estimate of daily excretion per unit bodyweight for each pooled sample. The daily excretion estimates ranged from 26.2 to 18200 ng/kg-d for children and from 20.1 to 165 ng/kg-d for adults. Urinary concentrations and estimated excretion rates were inversely associated with age, and estimated daily excretion rates in infants and young children were significantly higher than in adults (geometric mean: 107 and 47.0 ng/kg-d, respectively). Higher excretion of BPA in children may be explained by their higher food consumption relative to body weight compared to adults and adolescents, and may also reflect alternative exposure pathways and sources. Keywords: bisphenol A, biomonitoring, children, urine flow, Australia
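The step from a urinary concentration to a daily excretion estimate per unit bodyweight is simple arithmetic, sketched below. The abstract does not give the urine flow model itself, so the function simply assumes that a daily urine volume and a body weight are available for the pooled sample. The function name and the example inputs are hypothetical, not values from the study.

```python
def daily_excretion_ng_per_kg(conc_ng_per_ml, urine_flow_ml_per_day, body_weight_kg):
    """Daily excretion (ng/kg-d) = concentration (ng/ml) * urine flow (ml/d) / body weight (kg)."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return conc_ng_per_ml * urine_flow_ml_per_day / body_weight_kg

# Hypothetical adult pool: 2.0 ng/ml, 1500 ml/day, 75 kg.
example = daily_excretion_ng_per_kg(2.0, 1500.0, 75.0)
```

Because the result is normalised by body weight, a small child with the same urinary concentration as an adult can still have a higher excretion estimate if their urine output per kilogram is higher.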
Abstract:
Biomonitoring has become the ‘gold standard’ in assessing chemical exposures, and plays an important role in risk assessment. The pooling of biological specimens – combining multiple individual specimens into a single sample – can be used in biomonitoring studies to monitor levels of exposure and identify exposure trends, or to identify susceptible populations in a cost-effective manner. Pooled samples provide an estimate of central tendency, and may also reveal information about variation within the population. The development of a pooling strategy requires careful consideration of the type and number of samples collected, the number of pools required, and the number of specimens to combine per pool in order to maximize the type and robustness of the data. Creative pooling strategies can be used to explore exposure-outcome associations, and extrapolation from other larger studies can be useful in identifying elevated exposures in specific individuals. The use of pooled specimens is advantageous as it saves significantly on analytical costs, may reduce the time and resources required for recruitment, and in certain circumstances, allows quantification of samples approaching the limit of detection. In addition, use of pooled samples can provide population estimates while avoiding ethical difficulties that may be associated with reporting individual results.
Abstract:
The period of developmental vulnerability to toxicants begins at conception and extends through gestation, parturition, infancy and childhood to adolescence. The concern is that children (1) may experience quantitatively and qualitatively different exposures, and (2) may have different sensitivity to chemical pollutants. Traditional toxicological studies are inappropriate for assessing the results of chronic exposure at very low levels during critical periods of development. This paper will discuss (1) the health effects associated with exposure to selected emerging organic pollutants, including brominated flame retardants, perfluorinated compounds, organophosphate pesticides and bisphenol A; (2) difficulties in monitoring these substances in children; and (3) techniques and strategies for overcoming these difficulties. Such biomonitoring data can be used to identify where policies should be directed in order to reduce exposure, and to document policies that have successfully reduced exposure.