179 results for conditional


Relevance: 10.00%

Abstract:

This paper presents a new active learning query strategy for information extraction, called Domain Knowledge Informativeness (DKI). Active learning is often used to reduce the amount of annotation effort required to obtain training data for machine learning algorithms. A key component of an active learning approach is the query strategy, which is used to iteratively select samples for annotation. Knowledge resources have been used in information extraction as a means to derive additional features for sample representation. DKI is, however, the first query strategy that exploits such resources to inform sample selection. To evaluate the merits of DKI, in particular the reduction in annotation effort that it achieves, we conduct a comprehensive empirical comparison of active learning query strategies for information extraction within the clinical domain. The clinical domain was chosen for this work because of the availability of extensive structured knowledge resources, which have often been exploited for feature generation. In addition, the clinical domain offers a compelling use case for active learning because of the necessarily high costs and hurdles associated with obtaining annotations in this domain. Our experimental findings demonstrate that 1) amongst existing query strategies, the ones based on the classification model’s confidence are a better choice for clinical data, as they perform equally well with a much lighter computational load, and 2) significant reductions in annotation effort are achievable by exploiting knowledge resources within active learning query strategies, with up to 14% fewer tokens and concepts to manually annotate than with state-of-the-art query strategies.
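The abstract above notes that query strategies based on the classification model's confidence perform well at low computational cost. As a minimal sketch of one such strategy (least-confidence sampling, not DKI and not the authors' implementation), the function below picks the pool samples whose most likely label has the lowest predicted probability. The function name, the `pool_probs` data, and the batch size are hypothetical.

```python
def least_confidence_query(probabilities, batch_size):
    """Return indices of the samples the model is least confident about.

    `probabilities` is a list of per-sample class-probability lists,
    assumed to come from some already-trained classifier (hypothetical here).
    """
    # Confidence of a sample = probability assigned to its most likely label.
    confidences = [max(p) for p in probabilities]
    # Rank sample indices from least to most confident.
    ranked = sorted(range(len(probabilities)), key=lambda i: confidences[i])
    # The first `batch_size` indices are sent to the annotator.
    return ranked[:batch_size]


# Hypothetical predicted probabilities for four unlabelled samples.
pool_probs = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.51, 0.49],
]
selected = least_confidence_query(pool_probs, batch_size=2)
print(selected)
```

In an active learning loop, the selected samples would be annotated, added to the training set, and the model retrained before the next query round.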

Relevance: 10.00%

Abstract:

This article considers whether the granting of patents in respect of biomedical genetic research should be conditional upon the informed consent of research participants. It focuses upon several case studies. In Moore v the Regents of the University of California, a patient sued his physician for breach of fiduciary duty and lack of informed consent, because the doctor had obtained a patent on the patient's cell line without the patient's authorisation. In Greenberg v Miami Children's Hospital, the research participants, the Greenbergs, the National Tay-Sachs and Allied Diseases Association, and Dor Yeshorim brought a legal action against the geneticist Reuben Matalon and the Miami Children's Hospital over a patent obtained on a gene related to Canavan disease and an accompanying genetic diagnostic test. PXE International entered into a joint venture with Charles Boyd and the University of Hawaii, and together they obtained a patent for ‘methods for diagnosing Pseudoxanthoma elasticum’. In light of such case studies, it is contended that there is a need to reform patent law, so as to recognise the bioethical principles of informed consent and benefit-sharing. The 2005 UNESCO Declaration on Bioethics and Human Rights provides a model for future case law and policy-making.

Relevance: 10.00%

Abstract:

Background: Multilevel and spatial models are being increasingly used to obtain substantive information on area-level inequalities in cancer survival. Multilevel models assume independent geographical areas, whereas spatial models explicitly incorporate geographical correlation, often via a conditional autoregressive prior. However, the relative merits of these methods for large population-based studies have not been explored. Using a case-study approach, we report on the implications of using multilevel and spatial survival models to study geographical inequalities in all-cause survival. Methods: Multilevel discrete-time and Bayesian spatial survival models were used to study geographical inequalities in all-cause survival for a population-based colorectal cancer cohort of 22,727 cases aged 20–84 years diagnosed during 1997–2007 from Queensland, Australia. Results: Both approaches were viable on this large dataset, and produced similar estimates of the fixed effects. After adding area-level covariates, the between-area variability in survival using multilevel discrete-time models was no longer significant. Spatial inequalities in survival were also markedly reduced after adjusting for aggregated area-level covariates. Only the multilevel approach, however, provided an estimate of the contribution of geographical variation to the total variation in survival between individual patients. Conclusions: With little difference observed between the two approaches in the estimation of fixed effects, multilevel models should be favored if there is a clear hierarchical data structure and measuring the independent impact of individual- and area-level effects on survival differences is of primary interest. Bayesian spatial analyses may be preferred if spatial correlation between areas is important and if the priority is to assess small-area variations in survival and map spatial patterns. Both approaches can be readily fitted to geographically enabled survival data from international settings.

Relevance: 10.00%

Abstract:

Background: Spatial analysis is increasingly important for identifying modifiable geographic risk factors for disease. However, spatial health data from surveys are often incomplete, ranging from missing data for only a few variables to missing data for many variables. For spatial analyses of health outcomes, selection of an appropriate imputation method is critical in order to produce the most accurate inferences. Methods: We present a cross-validation approach to select between three imputation methods for health survey data with correlated lifestyle covariates, using as a case study type II diabetes mellitus (DM II) risk across 71 Queensland Local Government Areas (LGAs). We compare the accuracy of mean imputation to imputation using multivariate normal and conditional autoregressive prior distributions. Results: The best choice of imputation method depends upon the application and is not necessarily the most complex method. Mean imputation was selected as the most accurate method in this application. Conclusions: Selecting an appropriate imputation method for health survey data, after accounting for spatial correlation and correlation between covariates, allows more complete analysis of geographic risk factors for disease with more confidence in the results to inform public policy decision-making.
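The cross-validation idea in the abstract above can be illustrated roughly as follows: hide each observed value in turn, impute it with every candidate method, and keep the method with the smallest error. This is only a sketch under stated assumptions. The neighbour-average imputer is a crude stand-in for a spatially informed method (it is not a conditional autoregressive prior), and the toy values, adjacency structure, and function names are all hypothetical.

```python
import math


def mean_impute(values, neighbours, idx):
    """Impute with the mean of all observed values (idx is unused)."""
    known = [v for v in values if v is not None]
    return sum(known) / len(known)


def neighbour_impute(values, neighbours, idx):
    """Impute with the mean of observed adjacent areas (crude spatial stand-in)."""
    known = [values[j] for j in neighbours[idx] if values[j] is not None]
    if not known:
        return mean_impute(values, neighbours, idx)
    return sum(known) / len(known)


def cv_rmse(values, neighbours, impute):
    """Leave-one-out RMSE: mask each value, impute it, compare with the truth."""
    squared_errors = []
    for idx, truth in enumerate(values):
        masked = list(values)
        masked[idx] = None
        squared_errors.append((impute(masked, neighbours, idx) - truth) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical risk values for six areas arranged in a chain.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

methods = {"mean": mean_impute, "neighbour": neighbour_impute}
scores = {name: cv_rmse(values, neighbours, fn) for name, fn in methods.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

On this deliberately smooth toy data the neighbour-based imputer scores better, whereas in the application described in the abstract mean imputation was selected. The outcome depends on the data, which is why the selection is made by cross-validation.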

Relevance: 10.00%

Abstract:

Within online learning communities, receiving timely and meaningful insights into the quality of learning activities is an important part of an effective educational experience. Commonly adopted methods – such as the Community of Inquiry framework – rely on manual coding of online discussion transcripts, which is a costly and time-consuming process. There are several efforts underway to enable the automated classification of online discussion messages using supervised machine learning, which would enable the real-time analysis of interactions occurring within online learning communities. This paper investigates the importance of incorporating features that utilise the structure of online discussions for the classification of "cognitive presence" – the central dimension of the Community of Inquiry framework focusing on the quality of students' critical thinking within online learning communities. We implemented a Conditional Random Field classification solution, which incorporates structural features that may be useful in increasing classification performance over other implementations. Our approach leads to an improvement in classification accuracy of 5.8% over existing techniques when tested on the same dataset, with a precision and recall of 0.630 and 0.504 respectively.

Relevance: 10.00%

Abstract:

To identify susceptibility loci for visceral leishmaniasis, we undertook genome-wide association studies in two populations: 989 cases and 1,089 controls from India and 357 cases in 308 Brazilian families (1,970 individuals). The HLA-DRB1-HLA-DQA1 locus was the only region to show strong evidence of association in both populations. Replication at this region was undertaken in a second Indian population comprising 941 cases and 990 controls, and combined analysis across the three cohorts for rs9271858 at this locus showed P_combined = 2.76 × 10^-17 and odds ratio (OR) = 1.41, 95% confidence interval (CI) = 1.30–1.52. A conditional analysis provided evidence for multiple associations within the HLA-DRB1-HLA-DQA1 region, and a model in which risk differed between three groups of haplotypes better explained the signal and was significant in the Indian discovery and replication cohorts. In conclusion, the HLA-DRB1-HLA-DQA1 HLA class II region contributes to visceral leishmaniasis susceptibility in India and Brazil, suggesting shared genetic risk factors for visceral leishmaniasis that cross the epidemiological divides of geography and parasite species. © 2013 Nature America, Inc. All rights reserved.

Relevance: 10.00%

Abstract:

To gain further insight into the genetic architecture of psoriasis, we conducted a meta-analysis of 3 genome-wide association studies (GWAS) and 2 independent data sets genotyped on the Immunochip, including 10,588 cases and 22,806 controls. We identified 15 new susceptibility loci, increasing to 36 the number associated with psoriasis in European individuals. We also identified, using conditional analyses, five independent signals within previously known loci. The newly identified loci shared with other autoimmune diseases include candidate genes with roles in regulating T-cell function (such as RUNX3, TAGAP and STAT3). Notably, they included candidate genes whose products are involved in innate host defense, including interferon-mediated antiviral responses (DDX58), macrophage activation (ZC3H12C) and nuclear factor (NF)-κB signaling (CARD14 and CARM1). These results portend a better understanding of shared and distinctive genetic determinants of immune-mediated inflammatory disorders and emphasize the importance of the skin in innate and acquired host defense. © 2012 Nature America, Inc. All rights reserved.

Relevance: 10.00%

Abstract:

The MFG test is a family-based association test that detects genetic effects contributing to disease in offspring, including offspring allelic effects, maternal allelic effects and MFG incompatibility effects. Like many other family-based association tests, it assumes that the offspring survival and the offspring-parent genotypes are conditionally independent provided the offspring is affected. However, when the putative disease-increasing locus can affect another competing phenotype, for example, offspring viability, the conditional independence assumption fails and these tests could lead to incorrect conclusions regarding the role of the gene in disease. We propose the v-MFG test to adjust for the genetic effects on one phenotype, e.g., viability, when testing the effects of that locus on another phenotype, e.g., disease. Using genotype data from nuclear families containing parents and at least one affected offspring, the v-MFG test models the distribution of family genotypes conditional on offspring phenotypes. It simultaneously estimates genetic effects on two phenotypes, viability and disease. Simulations show that the v-MFG test produces accurate genetic effect estimates on disease as well as on viability under several different scenarios. It generates accurate type-I error rates and provides adequate power with moderate sample sizes to detect genetic effects on disease risk when viability is reduced. We demonstrate the v-MFG test with HLA-DRB1 data from study participants with rheumatoid arthritis (RA) and their parents, and show that the v-MFG test successfully detects an MFG incompatibility effect on RA while simultaneously adjusting for a possible viability loss.

Relevance: 10.00%

Abstract:

Information available on company websites can help people navigate to the offices of groups and individuals within the company. Automatically retrieving this within-organisation spatial information is a challenging AI problem. This paper introduces a novel unsupervised pattern-based method to extract within-organisation spatial information by taking advantage of HTML structure patterns, together with a novel Conditional Random Fields (CRF) based method to identify different categories of within-organisation spatial information. The results show that the proposed method can achieve a high performance in terms of F-Score, indicating that this purely syntactic method based on web search and an analysis of HTML structure is well-suited for retrieving within-organisation spatial information.

Relevance: 10.00%

Abstract:

Ship seakeeping operability refers to the quantification of motion performance in waves relative to mission requirements. This is used to make decisions about preferred vessel designs, but it can also be used as a comprehensive assessment of the benefits of ship-motion-control systems. Traditionally, operability computation aggregates statistics of motion computed over the envelope of likely environmental conditions in order to determine a coefficient in the range from 0 to 1 called operability. When used for assessment of motion-control systems, the increase of operability is taken as the key performance indicator. The operability coefficient is often given the interpretation of the percentage of time operable. This paper considers an alternative probabilistic approach to this traditional computation of operability. It characterises operability not as a number to which a frequency interpretation is attached, but as a hypothesis that a vessel will attain the desired performance in one mission considering the envelope of likely operational conditions. This enables the use of Bayesian theory to compute the probability that this hypothesis is true conditional on data from simulations. Thus, the metric considered is the probability of operability. This formulation not only adheres to recent developments in reliability and risk analysis, but also allows incorporating into the analysis more accurate descriptions of ship-motion-control systems, since the analysis is not limited to linear ship responses in the frequency domain. The paper also discusses an extension of the approach to the case of assessment of increased levels of autonomy for unmanned marine craft.
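To make the idea of a probability of operability conditional on simulation data more concrete, here is a minimal sketch using a Beta-Bernoulli model. This model is an assumption chosen for illustration and is not stated in the abstract: each simulated mission is treated as a pass/fail trial, and the posterior predictive probability that one further mission meets the performance criteria is the posterior mean. The function name, prior parameters, and counts are hypothetical.

```python
def probability_of_operability(successes, trials, prior_a=1.0, prior_b=1.0):
    """Posterior predictive probability that the next mission is operable.

    Assumes a Beta(prior_a, prior_b) prior on the per-mission success
    probability and independent pass/fail simulation outcomes.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    # Conjugate update: add observed passes and failures to the prior counts.
    post_a = prior_a + successes
    post_b = prior_b + (trials - successes)
    # Mean of the Beta posterior = predictive probability of one more pass.
    return post_a / (post_a + post_b)


# Hypothetical data: 18 of 20 simulated missions met the motion criteria.
p = probability_of_operability(successes=18, trials=20)
print(round(p, 3))
```

With a uniform prior, the result is (1 + 18) / (2 + 20), so more simulations move the estimate towards the observed pass rate while the prior tempers conclusions drawn from few runs.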

Relevance: 10.00%

Abstract:

Developing accurate and reliable crop detection algorithms is an important step for harvesting automation in horticulture. This paper presents a novel approach to visual detection of highly-occluded fruits. We use a conditional random field (CRF) on multi-spectral image data (colour and Near-Infrared Reflectance, NIR) to model two classes: crop and background. To describe these two classes, we explore a range of visual-texture features including local binary patterns, histograms of oriented gradients, and learned auto-encoder features. The proposed methods are evaluated using hand-labelled images from a dataset captured on a commercial capsicum farm. Experimental results are presented, and performance is evaluated in terms of the Area Under the Curve (AUC) of the precision-recall curves. Our current results achieve a maximum performance of 0.81 AUC when combining all of the texture features in conjunction with colour information.

Relevance: 10.00%

Abstract:

We derive a new method for determining size-transition matrices (STMs) that eliminates probabilities of negative growth and accounts for individual variability. STMs are an important part of size-structured models, which are used in the stock assessment of aquatic species. The elements of STMs represent the probability of growth from one size class to another, given a time step. The growth increment over this time step can be modelled with a variety of methods, but when a population construct is assumed for the underlying growth model, the resulting STM may contain entries that predict negative growth. To solve this problem, we use a maximum likelihood method that incorporates individual variability in the asymptotic length, relative age at tagging, and measurement error to obtain von Bertalanffy growth model parameter estimates. The statistical moments for the future length given an individual's previous length measurement and time at liberty are then derived. We moment match the true conditional distributions with skewed-normal distributions and use these to accurately estimate the elements of the STMs. The method is investigated with simulated tag-recapture data and tag-recapture data gathered from the Australian eastern king prawn (Melicertus plebejus).
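For context on why negative growth can appear, the standard von Bertalanffy growth curve and the corresponding population-level expected increment (the form used in the Fabens method) are shown below. The symbols are the conventional ones and are assumptions here, since the abstract does not give notation: $L_\infty$ is the asymptotic length, $k$ the growth rate, $t_0$ the theoretical age at zero length, $L_1$ the length at tagging, and $\Delta t$ the time at liberty.

```latex
% Standard von Bertalanffy growth curve
L(t) = L_\infty \left( 1 - e^{-k (t - t_0)} \right)

% Expected growth increment over a time step \Delta t, given length L_1
\Delta L = \left( L_\infty - L_1 \right) \left( 1 - e^{-k \Delta t} \right)
```

When a single population value of $L_\infty$ is used, any size class with $L_1 > L_\infty$ yields $\Delta L < 0$, which is the kind of negative-growth entry that the method described in the abstract is designed to eliminate.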

Relevance: 10.00%

Abstract:

The contemporary methodology for growth models of organisms is based on continuous trajectories and thus hinders us from modelling stepwise growth in crustacean populations. Growth models for fish are normally assumed to follow a continuous function, but a different type of model is needed for crustacean growth. Crustaceans must moult in order to grow. The growth of crustaceans is a discontinuous process due to the periodic shedding of the exoskeleton in moulting. The stepwise growth of crustaceans through the moulting process makes growth estimation more complex. Stochastic approaches can be used to model discontinuous growth, or what are commonly known as "jumps" (Figure 1). However, in a stochastic growth model we need to ensure that the model results in only positive jumps. In view of this, we introduce a subordinator, which is a special case of a Lévy process. A subordinator is a non-decreasing Lévy process that will assist in modelling crustacean growth for better understanding of the individual variability and stochasticity in moulting periods and increments. We develop the estimation methods for parameter estimation and illustrate them with the help of a dataset from laboratory experiments. The motivational dataset is from the ornate rock lobster, Panulirus ornatus, which can be found between Australia and Papua New Guinea. Due to the presence of sex effects on growth (Munday et al., 2004), we estimate the growth parameters separately for each sex. Because all hard parts are shed at moulting, exact age determination of a lobster can be challenging. However, the growth parameters for the aforementioned moult processes can be estimated from tank data through: (i) inter-moult periods, and (ii) moult increments. We attempt to derive a joint density, which is made up of two functions: one for moult increments and the other for time intervals between moults. We claim these functions are conditionally independent given pre-moult length and the inter-moult periods. The variables moult increments and inter-moult periods are said to be independent because of the Markov property or conditional probability. Hence, the parameters in each function can be estimated separately. Subsequently, we integrate both of the functions through a Monte Carlo method. We can therefore obtain a population mean for crustacean growth (e.g. red curve in Figure 1).
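The key property of a subordinator, that its increments are never negative, can be illustrated with a short simulation. The abstract does not say which subordinator is used, so the gamma process below is an assumption chosen because it is a standard example. The parameter names and values, including the starting length, are hypothetical.

```python
import random


def simulate_gamma_subordinator(n_steps, dt, shape_rate, scale, start=0.0, seed=0):
    """Simulate a gamma process (one example of a subordinator) on a time grid.

    Increments over a step of length dt are independent
    Gamma(shape_rate * dt, scale) draws, so they are non-negative and the
    simulated path never decreases.
    """
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_steps):
        # random.gammavariate(alpha, beta): alpha is the shape, beta the scale.
        jump = rng.gammavariate(shape_rate * dt, scale)
        path.append(path[-1] + jump)
    return path


# Hypothetical example: length in mm, starting at 20 mm, 50 time steps.
lengths = simulate_gamma_subordinator(
    n_steps=50, dt=0.1, shape_rate=2.0, scale=1.5, start=20.0
)
print(lengths[0], lengths[-1])
```

Averaging many such simulated paths is one simple Monte Carlo route to a population mean growth curve, in the spirit of the integration step mentioned in the abstract.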

Relevance: 10.00%

Abstract:

Estimation of von Bertalanffy growth parameters has received considerable attention in fisheries research. Since Sainsbury (1980, Can. J. Fish. Aquat. Sci. 37: 241-247), much of this research effort has centered on accounting for individual variability in the growth parameters. In this paper we demonstrate that, in the analysis of tagging data, Sainsbury's method and its derivatives do not, in general, satisfactorily account for individual variability in growth, leading to inconsistent parameter estimates (the bias does not tend to zero as sample size increases to infinity). The bias arises because these methods do not use appropriate conditional expectations as a basis for estimation. This bias is found to be similar to that of the Fabens method. Such methods would be appropriate only under the assumption that the individual growth parameters that generate the growth increment are independent of the growth parameters that generated the initial length. However, such an assumption would be unrealistic. The results are derived analytically and illustrated with a simulation study. Until techniques that take full account of the appropriate conditioning are developed, the effect of individual variability on growth will not be fully understood.

Relevance: 10.00%

Abstract:

We propose a simple method of constructing quasi-likelihood functions for dependent data based on conditional-mean-variance relationships, and apply the method to estimating the fractal dimension from box-counting data. Simulation studies were carried out to compare this method with the traditional methods. We also applied this technique to real data from fishing grounds in the Gulf of Carpentaria, Australia.
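For readers unfamiliar with box-counting data, the sketch below shows a common traditional estimator of the fractal dimension: count the occupied boxes at several box sizes and take the least-squares slope of log(count) against log(1/size). This is not the quasi-likelihood method proposed in the abstract, and the function names and test points are hypothetical.

```python
import math


def box_count(points, eps):
    """Number of distinct eps-by-eps grid boxes containing at least one point."""
    return len({(math.floor(x / eps), math.floor(y / eps)) for x, y in points})


def box_counting_dimension(points, sizes):
    """Ordinary least-squares slope of log N(eps) against log(1 / eps)."""
    xs = [math.log(1.0 / eps) for eps in sizes]
    ys = [math.log(box_count(points, eps)) for eps in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical check: points along a straight line should give dimension near 1.
line_points = [(i / 1024, i / 1024) for i in range(1024)]
box_sizes = [1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 32]
dimension = box_counting_dimension(line_points, box_sizes)
print(dimension)
```

Because the same points are counted at every box size, the counts are dependent across sizes, so ordinary least squares treats them as more informative than they are. That dependence is the motivation for the quasi-likelihood construction described in the abstract.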