141 results for Algorithms genetics
at Université de Lausanne, Switzerland
Abstract:
BACKGROUND: Serologic testing algorithms for recent HIV seroconversion (STARHS) provide important information for HIV surveillance. We have shown that a patient's antibody reaction in a confirmatory line immunoassay (INNO-LIA™ HIV I/II Score, Innogenetics) provides information on the duration of infection. Here, we sought to further investigate the diagnostic specificity of various Inno-Lia algorithms and to identify factors affecting it. METHODS: Plasma samples of 714 selected patients of the Swiss HIV Cohort Study infected for longer than 12 months and representing all viral clades and stages of chronic HIV-1 infection were tested blindly by Inno-Lia and classified as either incident (up to 12 months) or older infection by 24 different algorithms. Of the total, 524 patients received HAART, 308 had HIV-1 RNA below 50 copies/mL, and 620 were infected by an HIV-1 non-B clade. Using logistic regression analysis, we evaluated factors that might affect the specificity of these algorithms. RESULTS: HIV-1 RNA <50 copies/mL was associated with significantly lower reactivity to all five HIV-1 antigens of the Inno-Lia and impaired specificity of most algorithms. Among 412 patients either untreated or with HIV-1 RNA ≥50 copies/mL despite HAART, the median specificity of the algorithms was 96.5% (range 92.0-100%). The only factor that significantly promoted false-incident results in this group was age, with false-incident results increasing by a few percent per additional year. HIV-1 clade, HIV-1 RNA, CD4 percentage, sex, disease stage, and testing modalities exhibited no significance. Results were similar among 190 untreated patients. CONCLUSIONS: The specificity of most Inno-Lia algorithms was high and not affected by HIV-1 variability, advanced disease and other factors promoting false-recent results in other STARHS. Specificity should be good in any group of untreated HIV-1 patients.
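As a reminder of how the headline metric is computed: specificity here is the share of known long-standing infections that an algorithm correctly classifies as older. The sketch below is illustrative only; the function name and the counts are hypothetical and are not taken from the study.

```python
def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of truly older infections that are classified as older.

    A false positive here is a false-incident result: a patient infected
    for longer than 12 months whom the algorithm labels as incident.
    """
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no known-older samples to evaluate")
    return true_negatives / total


# Made-up counts for illustration (not the study's data).
print(f"{specificity(true_negatives=96, false_positives=4):.2f}")
```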
Abstract:
For the last 2 decades, supertree reconstruction has been an active field of research and has seen the development of a large number of major algorithms. Because of the growing popularity of the supertree methods, it has become necessary to evaluate the performance of these algorithms to determine which are the best options (especially with regard to the supermatrix approach that is widely used). In this study, seven of the most commonly used supertree methods are investigated by using a large empirical data set (in terms of number of taxa and molecular markers) from the worldwide flowering plant family Sapindaceae. Supertree methods were evaluated using several criteria: similarity of the supertrees with the input trees, similarity between the supertrees and the total evidence tree, level of resolution of the supertree and computational time required by the algorithm. Additional analyses were also conducted on a reduced data set to test if the performance levels were affected by the heuristic searches rather than the algorithms themselves. Based on our results, two main groups of supertree methods were identified: on the one hand, the matrix representation with parsimony (MRP), MinFlip, and MinCut methods performed well according to our criteria; on the other hand, the average consensus, split fit, and most similar supertree methods showed a poorer performance or at least did not behave the same way as the total evidence tree. Results for the super distance matrix, that is, the most recent approach tested here, were promising, with at least one derived method performing as well as MRP, MinFlip, and MinCut. The output of each method was only slightly improved when applied to the reduced data set, suggesting a correct behavior of the heuristic searches and a relatively low sensitivity of the algorithms to data set sizes and missing data. Results also showed that the MRP analyses could reach a high level of quality even when using a simple heuristic search strategy, with the exception of MRP with the Purvis coding scheme and reversible parsimony. The future of supertrees lies in the implementation of a standardized heuristic search for all methods and the increase in computing power to handle large data sets. The latter would prove to be particularly useful for promising approaches such as the maximum quartet fit method, which still requires substantial computing power.
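To make the "matrix representation" step of MRP concrete, here is a minimal sketch of the standard Baum-Ragan coding, in which each non-root clade of each input tree becomes one binary character. The abstract does not say which implementation was used, so the tree encoding (nested tuples of taxon names) and all function names are assumptions made for illustration. The parsimony search that follows the coding is not shown.

```python
def leaves(tree):
    """Return the set of taxon names below this node.

    A tree is either a taxon name (str) or a tuple of subtrees.
    """
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Collect the leaf sets of all internal nodes below the root."""
    found = []

    def visit(node, is_root):
        if isinstance(node, str):
            return
        if not is_root:
            found.append(leaves(node))
        for child in node:
            visit(child, False)

    visit(tree, True)
    return found


def mrp_matrix(trees):
    """Baum-Ragan coding: one column per non-root clade of each input tree."""
    all_taxa = sorted(frozenset().union(*(leaves(t) for t in trees)))
    rows = {taxon: [] for taxon in all_taxa}
    for tree in trees:
        present = leaves(tree)
        for clade in clades(tree):
            for taxon in all_taxa:
                if taxon not in present:
                    rows[taxon].append("?")  # taxon missing from this tree
                elif taxon in clade:
                    rows[taxon].append("1")  # inside the clade
                else:
                    rows[taxon].append("0")  # in the tree, outside the clade
    return {taxon: "".join(cells) for taxon, cells in rows.items()}


# Two small, partially overlapping input trees (made up for illustration).
input_trees = [(("A", "B"), ("C", "D")), (("A", "C"), "E")]
for taxon, row in mrp_matrix(input_trees).items():
    print(taxon, row)
```

The "?" entries are how taxa absent from an input tree enter the matrix as missing data, which is the sensitivity the abstract refers to.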
Abstract:
Regulatory gene networks contain generic modules, like those involving feedback loops, which are essential for the regulation of many biological functions (Guido et al. in Nature 439:856-860, 2006). We consider a class of self-regulated genes which are the building blocks of many regulatory gene networks, and study the steady-state distribution of the associated Gillespie algorithm by providing efficient numerical algorithms. We also study a regulatory gene network of interest in gene therapy, using mean-field models with time delays. Convergence of the related time-nonhomogeneous Markov chain is established for a class of linear catalytic networks with feedback loops.
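For orientation, the Gillespie algorithm can be sketched for one simple self-regulated gene. The model below is an assumption chosen for illustration (a two-state promoter with positive feedback), not the class of models analysed in the paper, and all parameter names and default values are hypothetical. It estimates the long-run distribution of the protein count by time-averaging a single simulated trajectory, which is a brute-force alternative to the efficient numerical algorithms the abstract mentions.

```python
import random


def simulate_self_regulated_gene(t_end, mu=10.0, nu=1.0, k_on=0.5,
                                 k_fb=0.1, k_off=1.0, seed=0):
    """Estimate the protein-count distribution by time-averaging one run.

    Hypothetical model: the gene switches on at rate k_on + k_fb * n
    (positive feedback from n protein copies) and off at rate k_off;
    protein is produced at rate mu while the gene is on, and each copy
    degrades at rate nu. The run starts with the gene off and n = 0, so
    the estimate includes a short initial transient.
    """
    rng = random.Random(seed)
    t, gene_on, n = 0.0, 0, 0
    time_at = {}  # protein count -> total time spent at that count
    while True:
        rates = [
            mu * gene_on,                       # 0: produce one protein
            nu * n,                             # 1: degrade one protein
            (k_on + k_fb * n) * (1 - gene_on),  # 2: gene switches on
            k_off * gene_on,                    # 3: gene switches off
        ]
        total = sum(rates)
        dt = rng.expovariate(total)  # waiting time to the next reaction
        if t + dt >= t_end:
            time_at[n] = time_at.get(n, 0.0) + (t_end - t)
            break
        time_at[n] = time_at.get(n, 0.0) + dt
        t += dt
        # Pick a reaction with probability proportional to its rate.
        threshold = rng.random() * total
        cumulative = 0.0
        for reaction, rate in enumerate(rates):
            cumulative += rate
            if threshold < cumulative:
                break
        if reaction == 0:
            n += 1
        elif reaction == 1:
            n -= 1
        elif reaction == 2:
            gene_on = 1
        else:
            gene_on = 0
    return {count: spent / t_end for count, spent in sorted(time_at.items())}


distribution = simulate_self_regulated_gene(t_end=200.0)
mean_count = sum(count * p for count, p in distribution.items())
print(f"estimated mean protein count: {mean_count:.2f}")
```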
Abstract:
The dispersal process, by which individuals or other dispersing agents such as gametes or seeds move from birthplace to a new settlement locality, has important consequences for the dynamics of genes, individuals, and species. Many of the questions addressed by ecology and evolutionary biology require a good understanding of species' dispersal patterns. Much effort has thus been devoted to overcoming the difficulties associated with dispersal measurement. In this context, genetic tools have long been the focus of intensive research, providing a great variety of potential solutions to measuring dispersal. This methodological diversity is reviewed here to help (molecular) ecologists find their way toward dispersal inference and interpretation and to stimulate further developments.
Abstract:
The algorithmic approach to data modelling has developed rapidly in recent years; in particular, methods based on data mining and machine learning have been used in a growing number of applications. These methods follow a data-driven methodology, aiming at providing the best possible generalization and predictive abilities instead of concentrating on the properties of the data model. One of the most successful groups of such methods is known as Support Vector algorithms. Following the fruitful developments in applying Support Vector algorithms to spatial data, this paper introduces a new extension of the traditional support vector regression (SVR) algorithm. This extension allows for the simultaneous modelling of environmental data at several spatial scales. The joint influence of environmental processes presenting different patterns at different scales is here learned automatically from data, providing the optimum mixture of short- and large-scale models. The method is adaptive to the spatial scale of the data. With this advantage, it can provide efficient means to model local anomalies that may typically arise at an early phase of an environmental emergency. However, the proposed approach still requires some prior knowledge of the possible existence of such short-scale patterns. This is a possible limitation of the method for its implementation in early warning systems. The purpose of this paper is to present the multi-scale SVR model and to illustrate its use with an application to the mapping of Cs-137 activity given the measurements taken in the region of Briansk following the Chernobyl accident.
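The abstract does not give the multi-scale formulation. One common way to let a kernel method see several spatial scales at once is a weighted sum of Gaussian kernels with different length scales, sketched below. This construction, the function names, and the fixed weights are all assumptions made for illustration; in the paper the mixture is described as being learned from the data.

```python
import math


def gaussian_kernel(x, y, length_scale):
    """Gaussian (RBF) similarity between two coordinate tuples."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * length_scale ** 2))


def multiscale_kernel(x, y, length_scales, weights):
    """Weighted sum of Gaussian kernels, one per spatial scale.

    A sum of valid kernels with non-negative weights is itself a valid
    kernel, so it can be plugged into a kernel regression method.
    """
    return sum(w * gaussian_kernel(x, y, s)
               for s, w in zip(length_scales, weights))


# Two measurement sites 5 distance units apart (hypothetical coordinates).
site_a, site_b = (0.0, 0.0), (3.0, 4.0)
short_and_long = multiscale_kernel(site_a, site_b,
                                   length_scales=[1.0, 50.0],
                                   weights=[0.5, 0.5])
print(f"{short_and_long:.4f}")
```

At this separation the short-scale component is close to zero while the large-scale component is close to one, which is how such a mixture can represent a local anomaly on top of a regional trend.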
Abstract:
A new study shows that wood ant queens selectively pass the maternally-inherited half of their genome to their daughters and the paternally-inherited half to their sons. This system, which most likely evolved from ancestral hybridization, creates distinct genetic lineages.
Abstract:
Hypertension is a common, modifiable and heritable cardiovascular risk factor. Some rare monogenic forms of hypertension have been described, but the majority of patients suffer from "essential" hypertension, for which the underlying pathophysiological mechanism is not clear. Essential hypertension is a complex trait, involving multiple genes and environmental factors. Recently, progress in the identification of common genetic variants associated with blood pressure and hypertension has been made thanks to large-scale international collaborative projects involving geneticists, epidemiologists, statisticians and clinicians. In this article, we review some basic genetic concepts and the main research methods used to study the genetics of hypertension, as well as selected recent findings in this field.
Abstract:
The classic organization of a gene structure has followed the Jacob and Monod bacterial gene model proposed more than 50 years ago. Since then, empirical determinations of the complexity of the transcriptomes found in organisms from yeast to human have blurred the definition and physical boundaries of genes. Using multiple analysis approaches, we have characterized individual gene boundaries mapping on human chromosomes 21 and 22. Analyses of the locations of the 5' and 3' transcriptional termini of 492 protein-coding genes revealed that for 85% of these genes the boundaries extend beyond the currently annotated termini, most often connecting with exons of transcripts from other well-annotated genes. The biological and evolutionary importance of these chimeric transcripts is underscored by (1) the non-random interconnections of genes involved, (2) the greater phylogenetic depth of the genes involved in many chimeric interactions, (3) the coordination of the expression of connected genes and (4) the close in vivo and three-dimensional proximity of the genomic regions being transcribed and contributing to parts of the chimeric RNAs. The non-random nature of the connection of the genes involved suggests that chimeric transcripts should not be studied in isolation, but together, as an RNA network.
Abstract:
Purpose of review: Elucidating the genetic background of Parkinson disease and essential tremor is crucial to understand the pathogenesis and improve diagnostic and therapeutic strategies. Recent findings: A number of approaches have been applied, including familial and association studies and studies of gene expression profiles, to identify genes involved in susceptibility to Parkinson disease. These studies have nominated a number of candidate Parkinson disease genes and novel loci including Omi/HtrA2, GIGYF2, FGF20, PDXK, EIF4G1 and PARK16. A recent notable finding has been the confirmation of the role of heterozygous mutations in glucocerebrosidase (GBA) as risk factors for Parkinson disease. Finally, association studies have nominated genetic variation in the leucine-rich repeat and Ig containing 1 gene (LINGO1) as a risk for both Parkinson disease and essential tremor, providing the first genetic evidence of a link between the two conditions. Summary: Although genes undoubtedly remain to be identified, considerable progress has been achieved in the understanding of the genetic basis of Parkinson disease. This same effort is now required for essential tremor. The use of next-generation high-throughput sequencing and genotyping technologies will help pave the way for future insight leading to advances in diagnosis, prevention and cure.
Abstract:
Trait decay may occur when selective pressures shift, owing to changes in environment or life style, rendering formerly adaptive traits non-functional or even maladaptive. It remains largely unknown if such decay would stem from multiple mutations with small effects or rather involve few loci with major phenotypic effects. Here, we investigate the decay of female sexual traits, and the genetic causes thereof, in a transition from haplodiploid sexual reproduction to endosymbiont-induced asexual reproduction in the parasitoid wasp Asobara japonica. We take advantage of the fact that asexual females cured of their endosymbionts produce sons instead of daughters, and that these sons can be crossed with sexual females. By combining behavioral experiments with crosses designed to introgress alleles from the asexual into the sexual genome, we found that sexual attractiveness, mating, egg fertilization and plastic adjustment of offspring sex ratio (in response to variation in local mate competition) are decayed in asexual A. japonica females. Furthermore, introgression experiments revealed that the propensity for cured asexual females to produce only sons (because of decayed sexual attractiveness, mating behavior and/or egg fertilization) is likely caused by recessive genetic effects at a single locus. Recessive effects were also found to cause decay of plastic sex-ratio adjustment under variable levels of local mate competition. Our results suggest that few recessive mutations drive decay of female sexual traits, at least in asexual species deriving from haplodiploid sexual ancestors.
Abstract:
Introduction: As part of the MicroArray Quality Control (MAQC)-II project, this analysis examines how the choice of univariate feature-selection methods and classification algorithms may influence the performance of genomic predictors under varying degrees of prediction difficulty represented by three clinically relevant endpoints. Methods: We used gene-expression data from 230 breast cancers (grouped into training and independent validation sets), and we examined 40 predictors (five univariate feature-selection methods combined with eight different classifiers) for each of the three endpoints. Their classification performance was estimated on the training set by using two different resampling methods and compared with the accuracy observed in the independent validation set. Results: A ranking of the three classification problems was obtained, and the performance of 120 models was estimated and assessed on an independent validation set. The bootstrapping estimates were closer to the validation performance than were the cross-validation estimates. The required sample size for each endpoint was estimated, and both gene-level and pathway-level analyses were performed on the obtained models. Conclusions: We showed that genomic predictor accuracy is determined largely by an interplay between sample size and classification difficulty. Variations on univariate feature-selection methods and choice of classification algorithm have only a modest impact on predictor performance, and several statistically equally good predictors can be developed for any given classification problem.
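The two families of resampling schemes compared in the abstract can be sketched as index generators. The abstract does not state which variants were used, so the k-fold layout and the out-of-bag bootstrap below, along with all function names, are assumptions made for illustration.

```python
import random


def k_fold_splits(n_samples, k, rng):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def bootstrap_splits(n_samples, n_rounds, rng):
    """Yield (train, test) index lists using the out-of-bag bootstrap.

    Training indices are drawn with replacement; samples never drawn in
    a round form that round's test set.
    """
    for _ in range(n_rounds):
        train = [rng.randrange(n_samples) for _ in range(n_samples)]
        drawn = set(train)
        test = [i for i in range(n_samples) if i not in drawn]
        yield train, test


rng = random.Random(0)
for train, test in k_fold_splits(n_samples=10, k=5, rng=rng):
    print("cv   train:", sorted(train), "test:", sorted(test))
for train, test in bootstrap_splits(n_samples=10, n_rounds=2, rng=rng):
    print("boot train:", sorted(train), "test:", test)
```

In either scheme, feature selection would need to be repeated inside each training split, so that the held-out samples play no part in choosing the features.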
Abstract:
Microarray transcript profiling and RNA interference are two new technologies crucial for large-scale gene function studies in multicellular eukaryotes. Both rely on sequence-specific hybridization between complementary nucleic acid strands, which prompted us to create a collection of gene-specific sequence tags (GSTs) that represent at least 21,500 Arabidopsis genes and are compatible with both approaches. The GSTs were carefully selected to ensure that each of them shared no significant similarity with any other region in the Arabidopsis genome. They were synthesized by PCR amplification from genomic DNA. Spotted microarrays fabricated from the GSTs show good dynamic range, specificity, and sensitivity in transcript profiling experiments. The GSTs have also been transferred to bacterial plasmid vectors via recombinational cloning protocols. These cloned GSTs constitute the ideal starting point for a variety of functional approaches, including reverse genetics. We have subcloned GSTs on a large scale into vectors designed for gene silencing in plant cells. We show that in planta expression of GST hairpin RNA results in the expected phenotypes in silenced Arabidopsis lines. These versatile GST resources provide novel and powerful tools for functional genomics.
Abstract:
BACKGROUND: The population genetic structure of a parasite, and consequently its ability to adapt to a given host, is strongly linked to its own life history as well as the life history of its host. While the effects of parasite life history on their population genetic structure have received some attention, the effect of host social system has remained largely unstudied. In this study, we investigated the population genetic structure of two closely related parasitic mite species (Spinturnix myoti and Spinturnix bechsteini) with very similar life histories. Their respective hosts, the greater mouse-eared bat (Myotis myotis) and the Bechstein's bat (Myotis bechsteinii) have social systems that differ in several substantial features, such as group size, mating system and dispersal patterns. RESULTS: We found that the two mite species have strongly differing population genetic structures. In S. myoti we found high levels of genetic diversity and very little pairwise differentiation, whereas in S. bechsteini we observed much less diversity, strongly differentiated populations and strong temporal turnover. These differences are likely to be the result of the differences in genetic drift and dispersal opportunities afforded to the two parasites by the different social systems of their hosts. CONCLUSIONS: Our results suggest that host social system can strongly influence parasite population structure. As a result, the evolutionary potential of these two parasites with very similar life histories also differs, thereby affecting the risk and evolutionary pressure exerted by each parasite on its host.
Abstract:
Defining an efficient training set is one of the most delicate phases for the success of remote sensing image classification routines. The complexity of the problem, the limited temporal and financial resources, as well as the high intraclass variance can make an algorithm fail if it is trained with a suboptimal dataset. Active learning aims at building efficient training sets by iteratively improving the model performance through sampling. A user-defined heuristic ranks the unlabeled pixels according to a function of the uncertainty of their class membership, and then the user is asked to provide labels for the most uncertain pixels. This paper reviews and tests the main families of active learning algorithms: committee, large margin, and posterior probability-based. For each of them, the most recent advances in the remote sensing community are discussed and some heuristics are detailed and tested. Several challenging remote sensing scenarios are considered, including very high spatial resolution and hyperspectral image classification. Finally, guidelines for choosing a suitable architecture are provided for new and/or inexperienced users.
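The ranking step described above can be sketched for the posterior probability-based family. The two heuristics below (margin between the two most probable classes, and entropy) are generic uncertainty measures chosen for illustration; the abstract does not list the specific heuristics tested, and the function names and posterior values are hypothetical.

```python
import math


def margin(posterior):
    """Gap between the two most probable classes; small means uncertain."""
    top_two = sorted(posterior, reverse=True)[:2]
    return top_two[0] - top_two[1]


def entropy(posterior):
    """Shannon entropy of the class posterior; large means uncertain."""
    return -sum(p * math.log(p) for p in posterior if p > 0.0)


def most_uncertain(posteriors, batch_size):
    """Return indices of the pixels with the smallest margin."""
    ranked = sorted(range(len(posteriors)), key=lambda i: margin(posteriors[i]))
    return ranked[:batch_size]


# Class posteriors for four unlabeled pixels (made-up values).
pixel_posteriors = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.50, 0.50, 0.00],
    [0.70, 0.20, 0.10],
]
print(most_uncertain(pixel_posteriors, batch_size=2))
```

In a full loop, the user would label the returned pixels, the classifier would be retrained on the enlarged training set, and the ranking would be recomputed.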