908 results for Non-dominated sorting genetic algorithms
Abstract:
In vitro regeneration of Arachis retusa was examined for the purpose of germplasm renewal and conservation. Random amplified polymorphic DNA (RAPD) fingerprinting was used to evaluate the genetic stability of plants derived from embryo axes and apical segments. Ten arbitrary decamer primers were screened and five of them were selected. Ninety genomic regions were evaluated, with an average of 18 loci per clone. All amplified segments were monomorphic. The results indicate that recovered plants are genetically stable at the assessed genomic regions and that both regeneration processes are suitable for in vitro germplasm preservation of Arachis species.
Abstract:
The use of comparative genomics to infer genome function relies on understanding how different components of the genome change over evolutionary time. The aim of such comparative analysis is to identify conserved, functionally transcribed sequences such as protein-coding genes and non-coding RNA genes, other functional sequences such as regulatory regions, and other genomic features. Here, we have compared the entire human chromosome 21 with syntenic regions of the mouse genome, and have identified a large number of conserved blocks of unknown function. Although previous studies have made similar observations, it is unknown whether these conserved sequences are genes. We present an extensive experimental and computational analysis of human chromosome 21 in an effort to assign function to sequences conserved between it and the syntenic mouse regions. Our data support the presence of a large number of potentially functional non-genic sequences, probably regulatory and structural. Integrating the properties of the conserved components of human chromosome 21 with the rapidly accumulating functional data for this chromosome will considerably improve our understanding of the role of sequence conservation in mammalian genomes.
Abstract:
The objective of this study was to assess the genetic diversity among and within seven populations of Moxotó goat (n = 264) from the States of Pernambuco, Paraíba and Rio Grande do Norte, using RAPD (Random Amplified Polymorphic DNA) markers. Moxotó, like other naturalized breeds, suffers genetic losses due to indiscriminate crossbreeding with other breeds raised in the Northeast Region of Brazil. Characterization of these genetic resources is essential for conservation and breeding programs. DNA was extracted from lymphocytes using a non-organic protocol. The 16 primers used were selected from 120 decamer oligonucleotide primers and generated 56 polymorphic bands. The analysis of molecular variance (AMOVA) showed that the greater part of total genetic variability (71.55%) was due to differences between individuals within populations, while 21.21% was due to differences among populations. The analysis of variance among pairs of populations demonstrated that the pair Floresta, PE x Angicos, RN presented the smallest interpopulation differentiation (8.9%), indicating low genetic divergence between these two populations. Nei's genetic distances among the populations varied between 0.0546 and 0.1868. The dendrogram generated showed that the Canindé breed, used as outgroup, clustered with the populations of Moxotó, indicating a possible common origin of the naturalized goat breeds.
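Nei's standard genetic distance, used above to compare the Moxotó populations, has a compact closed form: D = -ln(Jxy / sqrt(Jx·Jy)), where Jx, Jy and Jxy are the mean within- and between-population homozygosities across loci. A minimal sketch (with made-up allele frequencies, not the study's data):

```python
import math

def nei_distance(pop_x, pop_y):
    """Nei's (1972) standard genetic distance from per-locus allele frequencies.

    pop_x, pop_y: lists of loci; each locus is a list of allele
    frequencies (same allele order in both populations).
    """
    jx = sum(sum(f * f for f in locus) for locus in pop_x) / len(pop_x)
    jy = sum(sum(f * f for f in locus) for locus in pop_y) / len(pop_y)
    jxy = sum(sum(fx * fy for fx, fy in zip(lx, ly))
              for lx, ly in zip(pop_x, pop_y)) / len(pop_x)
    return -math.log(jxy / math.sqrt(jx * jy))

# Two hypothetical biallelic loci scored in two populations
pop_a = [[0.9, 0.1], [0.6, 0.4]]
pop_b = [[0.7, 0.3], [0.5, 0.5]]
print(round(nei_distance(pop_a, pop_b), 4))  # 0.0364
```

Identical populations give a distance of zero, and the distance grows as allele frequencies diverge, which is what makes it usable for dendrogram construction as in the study.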
Abstract:
The work presented here is part of a larger study to identify novel technologies and biomarkers for early Alzheimer's disease (AD) detection, and it focuses on evaluating the suitability of a new approach for early AD diagnosis by non-invasive methods. The purpose is to examine, in a pilot study, the potential of applying intelligent algorithms to speech features obtained from suspected patients, in order to contribute to improving the diagnosis of AD and the assessment of its degree of severity. Artificial Neural Networks (ANN) were used for the automatic classification of the two classes (AD and control subjects). Two speech-related dimensions were analyzed for feature selection: Spontaneous Speech and Emotional Response. Not only linear features but also non-linear ones, such as Fractal Dimension, were explored. The approach is non-invasive and low-cost, and has no side effects. The experimental results obtained were very satisfactory and promising for the early diagnosis and classification of AD patients.
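The abstract lists Fractal Dimension among the non-linear speech features but does not name an estimator; Katz's estimator is one common choice for 1-D signals and is simple enough to sketch (this is an illustrative assumption, not necessarily the estimator the authors used):

```python
import math

def katz_fd(x):
    """Katz fractal dimension of a 1-D signal such as a speech-feature
    trace: FD = log10(n) / (log10(n) + log10(d / L)), where L is the
    total curve length, d the maximum distance from the first sample,
    and n the number of steps."""
    n = len(x) - 1
    L = sum(abs(x[i + 1] - x[i]) for i in range(n))        # curve length
    d = max(abs(x[i] - x[0]) for i in range(1, len(x)))    # planar extent
    return math.log10(n) / (math.log10(n) + math.log10(d / L))

# A straight line has dimension exactly 1; irregular signals score higher
print(katz_fd(list(range(100))))   # 1.0
print(katz_fd([0, 3, 1, 4, 1, 5, 9, 2, 6]) > 1.0)
```

Values above 1 indicate increasing signal irregularity, which is the property such features are meant to capture in spontaneous speech.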
Abstract:
Predicting progeny performance from parental genetic divergence can potentially enhance the efficiency of supportive breeding programmes and facilitate risk assessment. Yet, experimental testing of the effects of breeding distance on offspring performance remains rare, especially in wild populations of vertebrates. Recent studies have demonstrated that embryos of salmonid fish are sensitive indicators of additive genetic variance for viability traits. We therefore used gametes of wild brown trout (Salmo trutta) from five genetically distinct populations of a river catchment in Switzerland, in a full factorial design, to produce over 2,000 embryos in 100 different crosses with varying genetic distances (FST range 0.005-0.035). Customized egg capsules allowed the survival of individual embryos to be recorded until hatching under natural field conditions. Our breeding design enabled us to evaluate the effects of the environment, of genetic and non-genetic parental contributions, and of interactions between these factors on embryo viability. We found that embryo survival was strongly affected by maternal environmental (i.e. non-genetic) effects and by the microenvironment, i.e. the location within the gravel. However, embryo survival was not predicted by population divergence, parental allelic dissimilarity, or heterozygosity, either in the field or under laboratory conditions. Our findings suggest that the genetic effects of inter-population hybridization within a genetically differentiated meta-population can be minor in comparison to environmental effects.
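A full factorial design at the scale described (100 crosses, over 2,000 embryos) can be laid out mechanically; a sketch assuming 10 sires x 10 dams and 20 embryos per cross (the abstract does not give the exact block structure, so these numbers are illustrative):

```python
from itertools import product

# Hypothetical full factorial layout: every sire crossed with every dam
males = [f"sire_{i}" for i in range(10)]
females = [f"dam_{j}" for j in range(10)]

crosses = [(m, f) for m, f in product(males, females)]
print(len(crosses))  # 100 unique sire x dam combinations

# 20 embryos per cross gives the ~2,000 embryos reported
embryos = [(m, f, k) for m, f in crosses for k in range(20)]
print(len(embryos))  # 2000
```

The point of the factorial layout is that every sire meets every dam, which lets additive sire effects, dam effects and their interaction be separated statistically.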
Abstract:
BACKGROUND: Active screening by mobile teams is considered the best method for detecting human African trypanosomiasis (HAT) caused by Trypanosoma brucei gambiense, but the current funding context in many post-conflict countries limits this approach. As an alternative, non-specialist health care workers (HCWs) in peripheral health facilities could be trained to identify potential cases who need testing based on their symptoms. We explored the predictive value of syndromic referral algorithms to identify symptomatic cases of HAT among a treatment-seeking population in Nimule, South Sudan. METHODOLOGY/PRINCIPAL FINDINGS: Symptom data from 462 patients (27 cases) presenting for a HAT test via passive screening over a 7-month period were collected to construct and evaluate over 14,000 four-item syndromic algorithms considered simple enough to be used by peripheral HCWs. For comparison, algorithms developed in other settings were also tested on our data, and a panel of expert HAT clinicians was asked to make referral decisions based on the symptom dataset. The best-performing algorithms consisted of three core symptoms (sleep problems, neurological problems and weight loss), with or without a history of oedema, cervical adenopathy or proximity to livestock. They had a sensitivity of 88.9-92.6%, a negative predictive value of up to 98.8% and a positive predictive value in this context of 8.4-8.7%. In terms of sensitivity, these out-performed more complex algorithms identified in other studies, as well as the expert panel. The best-performing algorithm is predicted to identify about 9/10 treatment-seeking HAT cases, though only 1/10 patients referred would test positive. CONCLUSIONS/SIGNIFICANCE: In the absence of regular active screening, improving referrals of HAT patients through other means is essential. Systematic use of syndromic algorithms by peripheral HCWs has the potential to increase case detection and would increase their participation in HAT programmes.
The algorithms proposed here, though promising, should be validated elsewhere.
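The metrics used to rank the syndromic algorithms (sensitivity, positive and negative predictive value) come straight from the referral/test-result confusion matrix; a sketch with illustrative counts on the scale of the study (462 patients, 27 cases; the exact cell counts here are invented to match the reported ranges):

```python
def screening_metrics(results):
    """Evaluate a referral rule against confirmed HAT test outcomes.

    results: list of (referred, diseased) booleans, one pair per patient.
    """
    tp = sum(1 for r, d in results if r and d)          # referred, case
    fp = sum(1 for r, d in results if r and not d)      # referred, non-case
    fn = sum(1 for r, d in results if not r and d)      # missed case
    tn = sum(1 for r, d in results if not r and not d)  # correctly not referred
    return {
        "sensitivity": tp / (tp + fn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# A rule that refers 24 of 27 cases plus 260 non-cases (462 patients total)
toy = ([(True, True)] * 24 + [(False, True)] * 3 +
       [(True, False)] * 260 + [(False, False)] * 175)
m = screening_metrics(toy)
print({k: round(v, 3) for k, v in m.items()})
```

With these counts the sensitivity is 24/27 = 88.9% and the PPV is 24/284 = 8.5%, illustrating the trade-off the abstract reports: almost all cases are caught, but only about one referred patient in ten tests positive.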
Abstract:
The noise power spectrum (NPS) is the reference metric for understanding the noise content in computed tomography (CT) images. To evaluate the noise properties of clinical multidetector CT (MDCT) scanners, local 2D and 3D NPS were computed for different acquisition and reconstruction parameters. A 64-row and a 128-row MDCT scanner were employed. Measurements were performed on a water phantom in axial and helical acquisition modes, with an identical CT dose index for both installations. The influence of parameters such as the pitch, the reconstruction filter (soft, standard and bone) and the reconstruction algorithm (filtered back-projection (FBP), adaptive statistical iterative reconstruction (ASIR)) was investigated. Images were also reconstructed in the coronal plane using a reformat process, and the 2D and 3D NPS were then computed. In axial acquisition mode, the 2D axial NPS showed an important magnitude variation as a function of the z-direction when measured at the phantom center. In helical mode, a directional dependency with a lobular shape was observed while the magnitude of the NPS remained constant. Important effects of the reconstruction filter, pitch and reconstruction algorithm were observed on the 3D NPS results for both MDCTs. With ASIR, a reduction of the NPS magnitude and a shift of the NPS peak to the low-frequency range were visible. The 2D coronal NPS obtained from the reformatted images was affected by the interpolation when compared with the 2D coronal NPS obtained from 3D measurements. The noise properties of volumes measured on last-generation MDCTs were thus studied using the local 3D NPS metric; however, the impact of noise non-stationarity may need further investigation.
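The standard 2D NPS estimator behind such measurements is: detrend an ensemble of noise-only ROIs from the uniform phantom, average the squared FFT magnitudes, and scale by pixel area over the number of ROI pixels. A minimal numpy sketch (the pixel size and ROI geometry are assumed, not taken from the study):

```python
import numpy as np

def nps_2d(rois, pixel_mm=0.5):
    """Local 2-D noise power spectrum from an ensemble of noise-only ROIs.

    rois: array (n_roi, N, N) of same-size ROIs from a uniform phantom.
    NPS(fx, fy) = (dx*dy / (Nx*Ny)) * <|FFT2(ROI - mean)|^2>.
    """
    rois = np.asarray(rois, dtype=float)
    n_roi, N, _ = rois.shape
    detrended = rois - rois.mean(axis=(1, 2), keepdims=True)  # remove DC term
    spectra = np.abs(np.fft.fft2(detrended)) ** 2             # per-ROI periodograms
    return (pixel_mm ** 2 / (N * N)) * spectra.mean(axis=0)

# Synthetic white noise, sigma = 10 HU: the NPS should be flat and, by
# Parseval's theorem, integrate back to the noise variance (~100 here)
rng = np.random.default_rng(0)
rois = rng.normal(0.0, 10.0, size=(64, 32, 32))
nps = nps_2d(rois)
variance = float(nps.sum() / (0.5 ** 2 * 32 * 32))
print(round(variance, 1))
```

Correlated (e.g. filtered) noise would instead concentrate the NPS at particular frequencies, which is exactly what the reconstruction-filter and ASIR comparisons in the study quantify.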
Abstract:
quantiNemo is an individual-based, genetically explicit stochastic simulation program. It was developed to investigate the effects of selection, mutation, recombination and drift on quantitative traits with varying architectures in structured populations connected by migration and located in a heterogeneous habitat. quantiNemo is highly flexible at various levels: population, selection, trait(s) architecture, genetic map for QTL and/or markers, environment, demography, mating system, etc. quantiNemo is coded in C++ using an object-oriented approach and runs on any computer platform. Availability: Executables for several platforms, user's manual, and source code are freely available under the GNU General Public License at http://www2.unil.ch/popgen/softwares/quantinemo.
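The population-genetic processes that simulators such as quantiNemo model (drift, migration, structured populations) can be illustrated with a toy island-model recursion; this is a didactic sketch of the underlying recursion, not quantiNemo's actual individual-based engine:

```python
import random

def island_model(n_demes=5, deme_size=100, m=0.05, p0=0.5,
                 generations=50, seed=1):
    """Toy island model: each generation, migration pulls every deme's
    allele frequency toward the metapopulation mean, then binomial
    drift resamples 2N gene copies per deme."""
    random.seed(seed)
    p = [p0] * n_demes
    for _ in range(generations):
        mean_p = sum(p) / n_demes
        # deterministic migration step
        p = [(1 - m) * pi + m * mean_p for pi in p]
        # stochastic drift step: resample 2N gene copies
        p = [sum(random.random() < pi for _ in range(2 * deme_size))
             / (2 * deme_size) for pi in p]
    return p

freqs = island_model()
print([round(f, 2) for f in freqs])
```

Full simulators add selection, mutation, recombination, explicit genetic maps and demography on top of this skeleton, which is why tools like quantiNemo are needed for realistic quantitative-trait scenarios.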
Abstract:
BACKGROUND: Non-communicable diseases (NCDs) are increasing worldwide. We hypothesize that environmental factors (including social adversity, diet, lack of physical activity and pollution) can become "embedded" in the biology of humans. We also hypothesize that the "embedding" partly occurs because of epigenetic changes, i.e., durable changes in gene expression patterns. Our concern is that once such factors have a foundation in human biology, they can affect human health (including NCDs) over a long period of time and across generations. OBJECTIVES: To analyze how worldwide changes in movements of goods, persons and lifestyles (globalization) may affect the "epigenetic landscape" of populations and through this have an impact on NCDs. We provide examples of such changes and effects by discussing the potential epigenetic impact of socio-economic status, migration, and diet, as well as the impact of environmental factors influencing trends in age at puberty. DISCUSSION: The study of durable changes in epigenetic patterns has the potential to influence policy and practice; for example, by enabling stratification of populations into those who could particularly benefit from early interventions to prevent NCDs, or by demonstrating mechanisms through which environmental factors influence disease risk, thus providing compelling evidence for policy makers, companies and the civil society at large. The current debate on the '25 × 25 strategy', a goal of 25% reduction in relative mortality from NCDs by 2025, makes the proposed approach even more timely. CONCLUSIONS: Epigenetic modifications related to globalization may crucially contribute to explain current and future patterns of NCDs, and thus deserve attention from environmental researchers, public health experts, policy makers, and concerned citizens.
Abstract:
Machine Learning for geospatial data: algorithms, software tools and case studies. Abstract: This thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense, machine learning can be considered a subfield of artificial intelligence concerned with the development of techniques and algorithms that allow computers to learn from data. In this thesis, machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? Most machine learning algorithms are universal, adaptive, non-linear, robust and efficient modeling tools. They can solve classification, regression and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical coordinates and additional relevant spatially referenced features ("geo-features"). They are well suited to implementation as predictive engines in decision support systems, for purposes of environmental data mining ranging from pattern recognition to modeling, prediction and automatic mapping. Their efficiency is competitive with geostatistical models in low-dimensional geographical spaces, and they are indispensable in high-dimensional geo-feature spaces.
The most important and popular machine learning algorithms and models for geo- and environmental sciences are presented in detail, from theoretical description of the concepts to software implementation. The main algorithms and models considered are: the multilayer perceptron (MLP, a workhorse of machine learning), general regression neural networks (GRNN), probabilistic neural networks (PNN), self-organising (Kohonen) maps (SOM), Gaussian mixture models (GMM), radial basis function networks (RBF) and mixture density networks (MDN). This set of models covers machine learning tasks such as classification, regression and density estimation. Exploratory data analysis (EDA) is the initial and a very important part of any data analysis. In this thesis, the concepts of exploratory spatial data analysis (ESDA) are treated both with the traditional geostatistical approach, experimental variography, and with machine learning. Experimental variography, which studies the relationships between pairs of points, is a basic tool for the geostatistical analysis of anisotropic spatial correlations; it helps to detect spatial patterns describable by two-point statistics. The machine learning approach to ESDA is presented through the k-nearest neighbors (k-NN) method, which is simple and has very good interpretation and visualization properties. An important part of the thesis deals with a topical problem: automatic mapping of geospatial data. The general regression neural network is proposed as an efficient model for this task. The performance of the GRNN is demonstrated on the Spatial Interpolation Comparison (SIC) 2004 data, where it significantly outperformed all other approaches, especially under the emergency-conditions scenario.
The thesis consists of four chapters: theory, applications, software tools and how-to-do-it examples. An important part of the work is a collection of software tools, Machine Learning Office, developed over the last 15 years and used both in many teaching courses, including international workshops in China, France, Italy, Ireland and Switzerland, and in fundamental and applied research projects. The case studies considered cover a wide spectrum of real-life low- and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals; classification of soil types and hydrogeological units; decision-oriented mapping with uncertainties; and natural hazard (landslides, avalanches) assessment and susceptibility mapping. Complementary tools for exploratory data analysis and visualisation were also developed, with care taken to make the software user-friendly and easy to use.
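The GRNN proposed for automatic mapping is, at its core, Nadaraya-Watson kernel regression with a single smoothing parameter sigma; a minimal sketch with hypothetical spatial data (coordinates and values invented for illustration):

```python
import math

def grnn_predict(x_query, X, y, sigma=1.0):
    """General Regression Neural Network (Specht, 1991): prediction is the
    Gaussian-kernel-weighted average of all training targets."""
    weights = [math.exp(-sum((a - b) ** 2 for a, b in zip(x_query, xi))
                        / (2 * sigma ** 2)) for xi in X]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)

# Hypothetical spatial samples: (easting, northing) -> measured value
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
y = [1.0, 2.0, 3.0, 4.0]

print(grnn_predict((0.5, 0.5), X, y, sigma=0.5))  # 2.5, by symmetry
print(grnn_predict((0.0, 0.0), X, y, sigma=0.1))  # ~1.0, nearest sample dominates
```

The single hyperparameter sigma (tuned, e.g., by cross-validation) is what makes the GRNN attractive for automatic mapping: there is essentially nothing else to configure, which suits emergency situations like the SIC 2004 scenario.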
Abstract:
The objectives of this work were to study the genetic control of grain yield (GY) and nitrogen (N) use efficiency (NUE, grain yield/N applied) and its primary components, N uptake efficiency (NUpE, N uptake/N applied) and N utilization efficiency (NUtE, grain yield/N uptake), in maize grown in environments with high and low N availability. Experiments with 31 maize genotypes (28 hybrid crosses and three controls) were carried out in soils with high and low N rates, in the southeast of the state of Minas Gerais, Brazil. There was a reduction of 23.2% in average GY for maize grown in soil with low N, in comparison to that obtained with high N. There were 26.5, 199 and 400% increases in NUtE, NUpE, and NUE, respectively, for maize grown with low N. The general combining ability (GCA) and specific combining ability (SCA) were significant for GY, NUE and NUpE for maize grown in high N soil. Only GCA was significant for NUpE for maize grown in low N soil. The GCA and SCA for NUtE were not significant in either environment. Additive and non-additive genetic effects are responsible for the genetic control of NUE and GY for maize grown in soils with high N availability, although additive effects are more important.
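The efficiency definitions given above are multiplicative (NUE = NUpE x NUtE), which a small worked example makes explicit (hypothetical plot values, not the study's data):

```python
def n_efficiencies(grain_yield, n_applied, n_uptake):
    """Nitrogen-use efficiency and its two multiplicative components."""
    nupe = n_uptake / n_applied      # N uptake efficiency
    nute = grain_yield / n_uptake    # N utilization efficiency
    nue = grain_yield / n_applied    # overall NUE = NUpE * NUtE
    return nue, nupe, nute

# Hypothetical plot: 6,000 kg/ha grain from 120 kg/ha N applied,
# of which 90 kg/ha was taken up by the crop
nue, nupe, nute = n_efficiencies(6000, 120, 90)
print(nue, nupe, nute)  # 50.0 0.75 66.66...
assert abs(nue - nupe * nute) < 1e-9
```

The identity explains the pattern reported in the abstract: halving the N applied leaves yield relatively stable but inflates both NUpE and NUE, because the same uptake and grain are divided by a much smaller denominator.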
Abstract:
Context: It is now clearly established that genetic factors, in association with the environment, play a key role in obesity and eating disorders. This project studies the clinical symptoms and molecular abnormalities in patients carrying a strong hereditary predisposition to obesity and eating behavior disorders. We have previously published the association between the chr16:29.5-30.1 Mb deletion and a very penetrant form of morbid obesity with macrocephaly, and have also demonstrated the association between the reciprocal chr16:29.5-30.1 Mb duplication and underweight with small head circumference. These two studies demonstrate that the dosage of one or several genes in this region regulates BMI as well as brain growth. At present, there are no data pointing towards particular candidate genes. We are currently investigating a second, non-overlapping recurrent CNV encompassing SH2B1, upstream of the aforementioned rearrangement. SNPs in this gene have been associated with BMI in genome-wide association studies, and mouse models confirmed this association. Bochukova et al. reported an association between deletions encompassing this gene and severe early-onset obesity, as well as insulin resistance. We are currently collecting and analyzing data to fully characterize the phenotype and the transcriptional patterns associated with this rearrangement. Aims: 1. Identify carriers of any CNVs in the greater 16p11.2 region (between chr16:28 Mb and 32 Mb) in the EGG consortium. 2. Perform association studies between SNPs in the greater 16p11.2 region (chr16:28-32 Mb) and anthropometric measures with adjusted "locus-wide significance", to identify or prioritize candidate genes potentially driving the association observed in patients with the CNVs (and thus worthy of further validation and sequencing). 3. Explore associations between GSV genome-wide and brain volume. 4. Explore the relationship between brain volumes (whole-brain and regional, for those who underwent brain MRI), head circumference and BMI. 5. Extrapolate this procedure to other regions covered by the Metabochip.
Methods: Examine and collect clinical as well as molecular information on these patients. Analyze MRI data in children and adults with BMI > 2 SD, comparing changes to MRI data obtained in patients with monogenic forms of obesity (data from the Lausanne study) and to underweight (BMI < -2 SD) individuals from EGG. Test whether opposite extremes of the phenotypic distribution may be highly informative. Expected results: This is a highly focused study, pertaining to approximately 1‰ of the human genome. Yet it is clear that, if successful, the lessons learned from this study could be extrapolated to other segments of the genome, and would need validation and replication by additional studies. Altogether, the results will contribute to further exploring the missing heritability and point to etiologic genes and pathways underlying these important health burdens.
Abstract:
BACKGROUND: Primary ciliary dyskinesia (PCD) is characterised by recurrent infections of the upper respiratory airways (nose, bronchi, and frontal sinuses) and randomisation of left-right body asymmetry. To date, PCD has mainly been described with autosomal recessive inheritance, and mutations have been found in five genes: the dynein arm protein subunits DNAI1, DNAH5 and DNAH11, the kinase TXNDC3, and the X-linked retinitis pigmentosa GTPase regulator RPGR. METHODS: We screened 89 unrelated individuals with PCD for mutations in the coding and splice-site regions of the gene DNAH5 by denaturing high-performance liquid chromatography (DHPLC) and sequencing. Patients were mainly of European origin and were recruited without any phenotypic preselection. RESULTS: We identified 18 novel (nonsense, splicing, small deletion and missense) mutations and six previously described ones. Interestingly, these DNAH5 mutations were mainly associated with combined outer and inner dynein arm ultrastructural defects (50%). CONCLUSION: Overall, mutations on both alleles of DNAH5 were identified in 15% of our clinically heterogeneous cohort of patients. Although genetic alterations remain to be identified in most patients, DNAH5 is to date the main PCD gene.
Abstract:
OBJECTIVE: Genetic studies might provide new insights into the biological mechanisms underlying lipid metabolism and risk of coronary artery disease (CAD). We therefore conducted a genome-wide association study to identify novel genetic determinants of low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), and triglycerides. METHODS AND RESULTS: We combined genome-wide association data from 8 studies, comprising up to 17 723 participants with information on circulating lipid concentrations. We did independent replication studies in up to 37 774 participants from 8 populations and also in a population of Indian Asian descent. We also assessed the association between single-nucleotide polymorphisms (SNPs) at lipid loci and risk of CAD in up to 9 633 cases and 38 684 controls. We identified 4 novel genetic loci that showed reproducible associations with lipids (probability values, 1.6×10⁻⁸ to 3.1×10⁻¹⁰). These include a potentially functional SNP in the SLC39A8 gene for HDL-C, an SNP near the MYLIP/GMPR and PPP1R3B genes for LDL-C, and an SNP at the AFF1 gene for triglycerides. SNPs showing strong statistical association with 1 or more lipid traits at the CELSR2, APOB, APOE-C1-C4-C2 cluster, LPL, ZNF259-APOA5-A4-C3-A1 cluster and TRIB1 loci were also associated with CAD risk (probability values, 1.1×10⁻³ to 1.2×10⁻⁹). CONCLUSIONS: We have identified 4 novel loci associated with circulating lipids. We also show that, in addition to loci largely associated with LDL-C, genetic loci mainly associated with circulating triglycerides and HDL-C are also associated with risk of CAD. These findings potentially provide new insights into the biological mechanisms underlying lipid metabolism and CAD risk.
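Combining per-study GWAS effect estimates, as done across the 8 studies here, is typically carried out with fixed-effect inverse-variance weighting (assumed here for illustration; the abstract does not state the exact method). A sketch for one SNP-lipid association with invented per-study estimates:

```python
import math

def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis: each study is
    weighted by 1/SE^2, so precise studies dominate the pooled estimate."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    z = beta / se
    return beta, se, z

# Hypothetical per-study effect sizes and standard errors for one SNP
betas = [0.10, 0.08, 0.12]
ses = [0.03, 0.04, 0.05]
beta, se, z = ivw_meta(betas, ses)
print(round(beta, 4), round(se, 4), round(z, 2))
```

The pooled standard error is always smaller than any single study's, which is how meta-analysis of 17 723 participants can push associations past genome-wide significance thresholds that no individual cohort reaches.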
Abstract:
As modern molecular biology moves towards the analysis of biological systems as opposed to their individual components, the need for appropriate mathematical and computational techniques for understanding the dynamics and structure of such systems is becoming more pressing. For example, the modeling of biochemical systems using ordinary differential equations (ODEs) based on high-throughput, time-dense profiles is becoming more commonplace, necessitating the development of improved techniques to estimate model parameters from such data. Due to the high dimensionality of this estimation problem, straightforward optimization strategies rarely produce correct parameter values, and current methods tend to use genetic/evolutionary algorithms to perform non-linear parameter fitting. Here, we describe a completely deterministic approach based on interval analysis, which allows us to examine entire sets of parameters and thus to exhaust the global search within a finite number of steps. In particular, we show how our method may be applied to a generic class of ODEs used for modeling biochemical systems, called Generalized Mass Action (GMA) models. In addition, we show that for GMAs our method is amenable to a technique in interval arithmetic called constraint propagation, which greatly improves its efficiency. To illustrate the applicability of our method, we apply it to several networks of biochemical reactions from the literature, showing in particular that, in addition to estimating system parameters in the absence of noise, our method may also be used to recover the topology of these networks.
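The deterministic interval idea can be shown on a one-parameter toy rate law (far simpler than a full GMA system, and not the paper's actual algorithm): bisect the parameter box and prune any box whose interval image cannot contain the observations. Because every pruned box is discarded with certainty, the search is exhaustive in finitely many steps:

```python
def f_interval(k_lo, k_hi, x):
    """Interval extension of the GMA-style rate law v = k * x^2 (x >= 0),
    returning guaranteed lower/upper bounds over the parameter box."""
    lo, hi = k_lo * x * x, k_hi * x * x
    return (min(lo, hi), max(lo, hi))

def bisect_search(data, k_box=(0.0, 10.0), tol=1e-6):
    """Interval branch-and-prune: keep only parameter boxes whose interval
    image contains every observation; split survivors until tiny."""
    boxes, accepted = [k_box], []
    while boxes:
        lo, hi = boxes.pop()
        if all(f_interval(lo, hi, x)[0] <= v <= f_interval(lo, hi, x)[1]
               for x, v in data):
            if hi - lo < tol:
                accepted.append((lo, hi))
            else:
                mid = (lo + hi) / 2
                boxes += [(lo, mid), (mid, hi)]
        # otherwise the entire box is provably inconsistent and is pruned
    return accepted

# Noise-free observations of v = 2.5 * x^2
data = [(1.0, 2.5), (2.0, 10.0), (3.0, 22.5)]
boxes = bisect_search(data)
print(boxes[0])  # a tiny interval containing k = 2.5
```

Constraint propagation, as used in the paper, accelerates exactly this loop: it shrinks each box from the data constraints before bisecting, so far fewer boxes need to be split.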