Digital Library

29 results for genome wide complex trait analysis

in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)

Genome wide scan for quantitative trait loci affecting tick resistance in cattle (Bos taurus x Bos indicus)

Relevance: 100.00%

Abstract:

Background: In tropical countries, losses caused by bovine tick Rhipicephalus (Boophilus) microplus infestation have a tremendous economic impact on cattle production systems. Genetic variation between Bos taurus and Bos indicus in tick resistance, together with molecular biology tools, might allow the identification of molecular markers linked to resistance traits that could be used as an auxiliary tool in selection programs. The objective of this work was to identify QTL associated with tick resistance/susceptibility in a bovine F2 population derived from the Gyr (Bos indicus) x Holstein (Bos taurus) cross. Results: Through a whole genome scan with microsatellite markers, we were able to map six genomic regions associated with bovine tick resistance. For most QTL, we found that different sets of genes could be involved in the resistance mechanism depending on the tick evaluation season (dry or rainy). We identified dry season specific QTL on BTA 2 and 10, and rainy season specific QTL on BTA 5, 11 and 27. We also found a highly significant genome wide QTL for both dry and rainy seasons in the central region of BTA 23. Conclusions: The experimental F2 population derived from the Gyr x Holstein cross successfully allowed the identification of six highly significant QTL associated with tick resistance in cattle. The QTL located on BTA 23 might be related to the bovine histocompatibility complex. Further investigation of these QTL will help to isolate candidate genes involved in tick resistance in cattle.

Strategies for genetic model specification in the screening of genome-wide meta-analysis signals for further replication

Relevance: 100.00%

Abstract:

Background: Meta-analysis is increasingly being employed as a screening procedure in large-scale association studies to select promising variants for follow-up studies. However, standard methods for meta-analysis require the assumption of an underlying genetic model, which is typically unknown a priori. This drawback can lead either to model misspecification, which makes power suboptimal, or to the evaluation of multiple genetic models, which increases the number of false-positive associations and ultimately wastes resources on fruitless replication studies. We used simulated meta-analyses of large genetic association studies to investigate naive strategies of genetic model specification to optimize screenings of genome-wide meta-analysis signals for further replication. Methods: Different methods, meta-analytical models and strategies were compared in terms of power and type-I error. Simulations were carried out for a binary trait in a wide range of true genetic models, genome-wide thresholds, minor allele frequencies (MAFs), odds ratios and between-study heterogeneity (τ²). Results: Among the investigated strategies, a simple Bonferroni-corrected approach that fits both multiplicative and recessive models was found to be optimal in most examined scenarios, reducing the likelihood of false discoveries and enhancing power in scenarios with small MAFs, either in the presence or in the absence of heterogeneity. Nonetheless, this strategy is sensitive to τ² whenever the susceptibility allele is common (MAF ≥ 30%), resulting in an increased number of false-positive associations compared with an analysis that considers only the multiplicative model. Conclusion: Invoking a simple Bonferroni adjustment and testing for both multiplicative and recessive models is a fast and optimal strategy in large meta-analysis-based screenings. However, care must be taken when the examined variants are common, where specification of a multiplicative model alone may be preferable.
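
As a rough illustration of the screening rule described in this abstract, the sketch below applies a Bonferroni adjustment across two genetic models. The function name, the default threshold of 5e-8, and the example p-values are assumptions made for illustration; none of them is taken from the study.

```python
def passes_two_model_screen(p_multiplicative, p_recessive, alpha=5e-8):
    """Return True if a variant passes a Bonferroni-corrected two-model screen.

    alpha is a hypothetical genome-wide threshold; it is split evenly
    across the two models that were fitted.
    """
    n_models = 2
    adjusted_threshold = alpha / n_models  # Bonferroni: divide by number of tests
    return min(p_multiplicative, p_recessive) <= adjusted_threshold


# Illustrative p-values only (not from the study).
print(passes_two_model_screen(1e-8, 0.5))  # 1e-8 is below 2.5e-8
print(passes_two_model_screen(4e-8, 0.5))  # 4e-8 is above 2.5e-8
```

The second variant would pass a multiplicative-only test at 5e-8 but fails once the threshold is halved, which is the price paid for also testing the recessive model.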

A Genome-Wide Association Study of Upper Aerodigestive Tract Cancers Conducted within the INHANCE Consortium

Relevance: 100.00%

Abstract:

Genome-wide association studies (GWAS) have been successful in identifying common genetic variation involved in susceptibility to etiologically complex disease. We conducted a GWAS to identify common genetic variation involved in susceptibility to upper aero-digestive tract (UADT) cancers. Genome-wide genotyping was carried out using the Illumina HumanHap300 beadchips in 2,091 UADT cancer cases and 3,513 controls from two large European multi-centre UADT cancer studies, as well as 4,821 generic controls. The 19 top-ranked variants were investigated further in an additional 6,514 UADT cancer cases and 7,892 controls of European descent from an additional 13 UADT cancer studies participating in the INHANCE consortium. Five common variants presented evidence for significant association in the combined analysis (p ≤ 5 × 10⁻⁷). Two novel variants were identified: a 4q21 variant (rs1494961, p = 1 × 10⁻⁸) located near the DNA repair related genes HEL308 and FAM175A (or Abraxas), and a 12q24 variant (rs4767364, p = 2 × 10⁻⁸) located in an extended linkage disequilibrium region that contains multiple genes including the aldehyde dehydrogenase 2 (ALDH2) gene. The three remaining variants are located in the ADH gene cluster and were identified previously in a candidate gene study involving some of these samples. The association between these three variants and UADT cancers was independently replicated in 5,092 UADT cancer cases and 6,794 controls from the non-overlapping samples presented here (rs1573496-ADH7, p = 5 × 10⁻⁸; rs1229984-ADH1B, p = 7 × 10⁻⁹; and rs698-ADH1C, p = 0.02). These results implicate two variants at 4q21 and 12q24 and further highlight three ADH variants in UADT cancer susceptibility.

An empirical evaluation of imputation accuracy for association statistics reveals increased type-I error rates in genome-wide associations

Relevance: 100.00%

Abstract:

Background: Genome wide association studies (GWAS) are becoming the approach of choice to identify genetic determinants of complex phenotypes and common diseases. The astonishing amount of generated data and the use of distinct genotyping platforms with variable genomic coverage are still analytical challenges. Imputation algorithms combine information from directly genotyped markers with the haplotypic structure of the population of interest to infer a badly genotyped or missing marker, and are considered a near zero cost approach to allow the comparison and combination of data generated in different studies. Several reports have stated that imputed markers have an overall acceptable accuracy, but no published report has performed a pairwise comparison of imputed and empirical association statistics for a complete set of GWAS markers. Results: In this report we identified a total of 73 imputed markers that yielded a nominally statistically significant association at P < 10⁻⁵ for type 2 Diabetes Mellitus and compared them with results obtained based on empirical allelic frequencies. Interestingly, despite their overall high correlation, association statistics based on imputed frequencies were discordant in 35 of the 73 (47%) associated markers, considerably inflating the type I error rate of imputed markers. We comprehensively tested several quality thresholds, the haplotypic structure underlying imputed markers and the use of flanking markers as predictors of inaccurate association statistics derived from imputed markers. Conclusions: Our results suggest that association statistics from imputed markers that fall within a specific minor allele frequency (MAF) range, are located in weak linkage disequilibrium blocks, or strongly deviate from local patterns of association are prone to inflated false positive association signals. The present study highlights the potential of imputation procedures and proposes simple procedures for selecting the best imputed markers for follow-up genotyping studies.

Genome-wide SNP genotyping highlights the role of natural selection in Plasmodium falciparum population divergence

Relevance: 100.00%

Abstract:

Background: The malaria parasite Plasmodium falciparum exhibits abundant genetic diversity, and this diversity is key to its success as a pathogen. Previous efforts to study genetic diversity in P. falciparum have begun to elucidate the demographic history of the species, as well as patterns of population structure and patterns of linkage disequilibrium within its genome. Such studies will be greatly enhanced by new genomic tools and recent large-scale efforts to map genomic variation. To that end, we have developed a high throughput single nucleotide polymorphism (SNP) genotyping platform for P. falciparum. Results: Using an Affymetrix 3,000 SNP assay array, we found roughly half the assays (1,638) yielded high quality, 100% accurate genotyping calls for both major and minor SNP alleles. Genotype data from 76 global isolates confirm significant genetic differentiation among continental populations and varying levels of SNP diversity and linkage disequilibrium according to geographic location and local epidemiological factors. We further discovered that nonsynonymous and silent (synonymous or noncoding) SNPs differ with respect to within-population diversity, interpopulation differentiation, and the degree to which allele frequencies are correlated between populations. Conclusions: The distinct population profile of nonsynonymous variants indicates that natural selection has a significant influence on genomic diversity in P. falciparum, and that many of these changes may reflect functional variants deserving of follow-up study. Our analysis demonstrates the potential for new high-throughput genotyping technologies to enhance studies of population structure, natural selection, and ultimately enable genome-wide association studies in P. falciparum to find genes underlying key phenotypic traits.

Single-nucleotide polymorphism, linkage disequilibrium and geographic structure in the malaria parasite Plasmodium vivax: prospects for genome-wide association studies

Relevance: 100.00%

Abstract:

Background: The ideal malaria parasite populations for initial mapping of genomic regions contributing to phenotypes such as drug resistance and virulence, through genome-wide association studies, are those with high genetic diversity, allowing for numerous informative markers, and rare meiotic recombination, allowing for strong linkage disequilibrium (LD) between markers and phenotype-determining loci. However, levels of genetic diversity and LD in field populations of the major human malaria parasite P. vivax remain poorly characterized. Results: We examined single-nucleotide polymorphisms (SNPs) and LD patterns across a 100-kb chromosome segment of P. vivax in 238 field isolates from areas of low to moderate malaria endemicity in South America and Asia, where LD tends to be more extensive than in holoendemic populations, and in two monkey-adapted strains (Salvador-I, from El Salvador, and Belem, from Brazil). We found varying levels of SNP diversity and LD across populations, with the highest diversity and strongest LD in the area of lowest malaria transmission. We found several clusters of contiguous markers with rare meiotic recombination and characterized a relatively conserved haplotype structure among populations, suggesting the existence of recombination hotspots in the genome region analyzed. Both silent and nonsynonymous SNPs revealed substantial between-population differentiation, which accounted for approximately 40% of the overall genetic diversity observed. Although parasites clustered according to their continental origin, we found evidence for substructure within the Brazilian population of P. vivax. We also explored between-population differentiation patterns revealed by loci putatively affected by natural selection and found marked geographic variation in frequencies of nucleotide substitutions at the pvmdr-1 locus, putatively associated with drug resistance. Conclusion: These findings support the feasibility of genome-wide association studies in carefully selected populations of P. vivax, using relatively low densities of markers, but underscore the risk of false positives caused by population structure at both local and regional levels.

Genome-Wide Detection of Serpentine Receptor-Like Proteins in Malaria Parasites

Relevance: 100.00%

Abstract:

Serpentine receptors comprise a large family of membrane receptors distributed over diverse organisms, such as bacteria, fungi, plants and all metazoans. However, the presence of serpentine receptors in protozoan parasites is largely unknown so far. In the present study we performed a genome-wide search for proteins containing seven transmembrane domains (7TM) in the human malaria parasite Plasmodium falciparum and identified four serpentine receptor-like proteins. These proteins, denoted PfSR1, PfSR10, PfSR12 and PfSR25, show membrane topologies that resemble those exhibited by members belonging to different families of serpentine receptors. Expression of the pfsrs genes was detected by Real Time PCR in P. falciparum intraerythrocytic stages, indicating that they potentially code for functional proteins. We also found corresponding homologues for the PfSRs in five other Plasmodium species, two primate and three rodent parasites. PfSR10 and 25 are the most conserved receptors among the different species, while PfSR1 and 12 are more divergent. Interestingly, we found that PfSR10 and PfSR12 possess similarity to orphan serpentine receptors of other organisms. The identification of potential parasite membrane receptors raises a new perspective for essential aspects of malaria parasite host cell infection.

Complex networks analysis of manual and machine translations

Relevance: 100.00%

Abstract:

Complex networks have been increasingly used in text analysis, including in connection with natural language processing tools, as important text features appear to be captured by the topology and dynamics of the networks. Following previous works that apply complex networks concepts to text quality measurement, summary evaluation, and author characterization, we now focus on machine translation (MT). In this paper we assess the possible representation of texts as complex networks to evaluate cross-linguistic issues inherent in manual and machine translation. We show that translations of different quality generated by MT tools can be distinguished from their manual counterparts by means of metrics such as in-degree (ID) and out-degree (OD), clustering coefficient (CC), and shortest paths (SP). For instance, we demonstrate that the average OD in networks of automatic translations consistently exceeds the values obtained for manual ones, and that the CC values of source texts are not preserved for manual translations, but are for good automatic translations. This probably reflects the text rearrangements humans perform during manual translation. We envisage that such findings could lead to better MT tools and automatic evaluation metrics.

Association between a 15q25 gene variant, smoking quantity and tobacco-related cancers among 17 000 individuals

Relevance: 100.00%

Abstract:

Methods: We performed a detailed analysis of one 15q single nucleotide polymorphism (SNP) (rs16969968) with smoking behaviour and cancer risk in a total of 17 300 subjects from five lung cancer (LC) studies and four upper aerodigestive tract (UADT) cancer studies. Results: Subjects with one minor allele smoked on average 0.3 cigarettes per day (CPD) more, whereas subjects with the homozygous minor AA genotype smoked on average 1.2 CPD more than subjects with a GG genotype (P < 0.001). The variant was associated with heavy smoking (> 20 CPD) [odds ratio (OR) = 1.13, 95% confidence interval (CI) 0.96-1.34, P = 0.13 for heterozygotes and 1.81, 95% CI 1.39-2.35 for homozygotes, P < 0.0001]. The strong association between the variant and LC risk (OR = 1.30, 95% CI 1.23-1.38, P = 1 × 10⁻¹⁸) was virtually unchanged after adjusting for this smoking association (smoking adjusted OR = 1.27, 95% CI 1.19-1.35, P = 5 × 10⁻¹³). Furthermore, we found an association between the variant allele and an earlier age of LC onset (P = 0.02). The association was also noted in UADT cancers (OR = 1.08, 95% CI 1.01-1.15, P = 0.02). Genome wide association (GWA) analysis of over 300 000 SNPs on 11 219 subjects did not identify any additional variants related to smoking behaviour. Conclusions: This study confirms the strong association between 15q gene variants and LC and shows an independent association with smoking quantity, as well as an association with UADT cancers.
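
The odds ratios, confidence intervals and P values quoted in this abstract are linked by the usual normal approximation on the log-odds scale. The sketch below recovers an approximate two-sided P value from a reported odds ratio and its 95% interval. It assumes a Wald-type interval, which the abstract does not state, and the function name is hypothetical.

```python
import math


def p_from_or_ci(odds_ratio, ci_lower, ci_upper, z_crit=1.96):
    """Approximate two-sided P value from an odds ratio and its 95% CI.

    Assumes the interval is a Wald interval on the log-odds scale,
    i.e. log(OR) +/- z_crit * SE.
    """
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|))
    return math.erfc(abs(z) / math.sqrt(2))


# UADT cancer association quoted in the abstract: OR = 1.08, 95% CI 1.01-1.15
p_uadt = p_from_or_ci(1.08, 1.01, 1.15)
print(f"approximate P = {p_uadt:.3f}")
```

Because the published estimates are rounded, a value recovered this way should be expected to agree only approximately with the reported P value (P = 0.02 for UADT cancers).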

High-throughput SNP genotyping in the highly heterozygous genome of Eucalyptus: assay success, polymorphism and transferability across species

Relevance: 100.00%

Abstract:

Background: High-throughput SNP genotyping has become an essential requirement for molecular breeding and population genomics studies in plant species. Large scale SNP developments have been reported for several mainstream crops. A growing interest now exists to expand the speed and resolution of genetic analysis to outbred species with highly heterozygous genomes. When nucleotide diversity is high, a refined diagnosis of the target SNP sequence context is needed to convert queried SNPs into high-quality genotypes using the Golden Gate Genotyping Technology (GGGT). This issue becomes exacerbated when attempting to transfer SNPs across species, a scarcely explored topic in plants, and one likely to become significant for population genomics and interspecific breeding applications in less domesticated and less funded plant genera. Results: We have successfully developed the first set of 768 SNPs assayed by the GGGT for the highly heterozygous genome of Eucalyptus from a mixed Sanger/454 database with 1,164,695 ESTs and the preliminary 4.5X draft genome sequence for E. grandis. A systematic assessment of in silico SNP filtering requirements showed that stringent constraints on the SNP surrounding sequences have a significant impact on SNP genotyping performance and polymorphism. SNP assay success was high for the 288 SNPs selected with more rigorous in silico constraints; 93% of them provided high quality genotype calls and 71% of them were polymorphic in a diverse panel of 96 individuals of five different species. SNP reliability was high across nine Eucalyptus species belonging to three sections within subgenus Symphomyrtus and still satisfactory across species of two additional subgenera, although polymorphism declined as phylogenetic distance increased. Conclusions: This study indicates that the GGGT performs well both within and across species of Eucalyptus notwithstanding its nucleotide diversity ≥ 2%. The development of a much larger array of informative SNPs across multiple Eucalyptus species is feasible, although strongly dependent on having a representative and sufficiently deep collection of sequences from many individuals of each target species. A higher density SNP platform will be instrumental to undertake genome-wide phylogenetic and population genomics studies and to implement molecular breeding by Genomic Selection in Eucalyptus.

Genomic Analysis of Wild Tomato Introgressions Determining Metabolism- and Yield-Associated Traits

Relevance: 100.00%

Abstract:

With the aim of determining the genetic basis of metabolic regulation in tomato fruit, we constructed a detailed physical map of genomic regions spanning previously described metabolic quantitative trait loci of a Solanum pennellii introgression line population. Two genomic libraries from S. pennellii were screened with 104 colocated markers from five selected genomic regions, and a total of 614 bacterial artificial chromosome (BAC)/cosmids were identified as seed clones. Integration of sequence data with the genetic and physical maps of Solanum lycopersicum facilitated the anchoring of 374 of these BAC/cosmid clones. The analysis of this information resulted in a genome-wide map of a nondomesticated plant species and covers 10% of the physical distance of the selected regions corresponding to approximately 1% of the wild tomato genome. Comparative analyses revealed that S. pennellii and domesticated tomato genomes can be considered as largely colinear. A total of 1,238,705 bp from both BAC/cosmid ends and nine large insert clones were sequenced, annotated, and functionally categorized. The sequence data allowed the evaluation of the level of polymorphism between the wild and cultivated tomato species. An exhaustive microsynteny analysis allowed us to estimate the divergence date of S. pennellii and S. lycopersicum at 2.7 million years ago. The combined results serve as a reference for comparative studies both at the macrosyntenic and microsyntenic levels. They also provide a valuable tool for fine-mapping of quantitative trait loci in tomato. Furthermore, they will contribute to a deeper understanding of the regulatory factors underpinning metabolism and hence defining crop chemical composition.

Xylella fastidiosa gene expression analysis by DNA microarrays

Relevance: 100.00%

Abstract:

Xylella fastidiosa genome sequencing has generated valuable data by identifying genes acting either on metabolic pathways or in associated pathogenicity and virulence. Based on available information on these genes, new strategies for studying their expression patterns, such as microarray technology, were employed. A total of 2,600 primer pairs were synthesized and then used to generate fragments using the PCR technique. The arrays were hybridized against cDNAs labeled during reverse transcription reactions and which were obtained from bacteria grown under two different conditions (liquid XDM2 and liquid BCYE). All data were statistically analyzed to verify which genes were differentially expressed. In addition to exploring conditions for X. fastidiosa genome-wide transcriptome analysis, the present work observed the differential expression of several classes of genes (energy, protein, amino acid and nucleotide metabolism, transport, degradation of substances, toxins and hypothetical proteins, among others). The understanding of expressed genes in these two different media will be useful in comprehending the metabolic characteristics of X. fastidiosa, and in evaluating how important certain genes are for the functioning and survival of these bacteria in plants.

An assessment of population structure in eight breeds of cattle using a whole genome SNP panel

Relevance: 100.00%

Abstract:

Background: Analyses of population structure and breed diversity have provided insight into the origin and evolution of cattle. Previously, these studies have used a low density of microsatellite markers; however, with the large number of single nucleotide polymorphism markers that are now available, it is possible to perform genome wide population genetic analyses in cattle. In this study, we used a high-density panel of SNP markers to examine population structure and diversity among eight cattle breeds sampled from Bos indicus and Bos taurus. Results: Two thousand six hundred and forty one single nucleotide polymorphisms (SNPs) spanning all of the bovine autosomal genome were genotyped in Angus, Brahman, Charolais, Dutch Black and White Dairy, Holstein, Japanese Black, Limousin and Nelore cattle. Population structure was examined using the linkage model in the program STRUCTURE, and Fst estimates were used to construct a neighbor-joining tree to represent the phylogenetic relationship among these breeds. Conclusion: The whole-genome SNP panel identified several levels of population substructure in the set of examined cattle breeds. The greatest level of genetic differentiation was detected between the Bos taurus and Bos indicus breeds. When the Bos indicus breeds were excluded from the analysis, genetic differences among beef versus dairy and European versus Asian breeds were detected among the Bos taurus breeds. Exploration of the number of SNP loci required to differentiate between breeds showed that for 100 SNP loci, individuals could only be correctly clustered into breeds 50% of the time; thus a large number of SNP markers are required to replace the 30 microsatellite markers that are currently commonly used in genetic diversity studies.

Genetic Background of Patients from a University Medical Center in Manhattan: Implications for Personalized Medicine

Relevance: 100.00%

Abstract:

Background: The rapid progress currently being made in genomic science has created interest in potential clinical applications; however, formal translational research has been limited thus far. Studies of population genetics have demonstrated substantial variation in allele frequencies and haplotype structure at loci of medical relevance, and the genetic background of patient cohorts may often be complex. Methods and Findings: To describe the heterogeneity in an unselected clinical sample, we used the Affymetrix 6.0 gene array chip to genotype self-identified European Americans (N = 326), African Americans (N = 324) and Hispanics (N = 327) from the medical practice of Mount Sinai Medical Center in Manhattan, NY. Additional data from US minority groups and Brazil were used for external comparison. Substantial variation in ancestral origin was observed for both African Americans and Hispanics; data from the latter group overlapped with both Mexican Americans and Brazilians in the external data sets. A pooled analysis of the African Americans and Hispanics from NY demonstrated a broad continuum of ancestral origin, making classification by race/ethnicity uninformative. Selected loci harboring variants associated with medical traits and drug response confirmed substantial within- and between-group heterogeneity. Conclusion: As a consequence of these complementary levels of heterogeneity, group labels offered no guidance at the individual level. These findings demonstrate the complexity involved in clinical translation of the results from genome-wide association studies and suggest that in the genomic era conventional racial/ethnic labels are of little value.

Determination and Molecular Analysis of the Complete Genome Sequence of Two Wild-Type Rabies Viruses Isolated from a Haematophagous Bat and a Frugivorous Bat in Brazil

Relevance: 100.00%

Abstract:

The complete genome sequences of two Brazilian wild-type rabies viruses (RABV), a BR-DR1 isolate from a haematophagous bat (Desmodus rotundus) and a BR-AL1 isolate from a frugivorous bat (Artibeus lituratus), were determined. The genomes of BR-DR1 and BR-AL1 had 11,923 and 11,922 nt, respectively, and both encoded the five standard genes of rhabdoviruses. The complete nucleotide sequence identity between the BR-DR1 and BR-AL1 isolates was 97%. The BR-DR1 and BR-AL1 isolates had some conserved functional sites revealed by the fixed isolates, whereas both isolates had unique amino acid substitutions in the antigenic region IV of the nucleocapsid gene. Therefore, it is speculated that both isolates were nearly identical in virologic character. According to our phylogenetic analysis based on the complete genomes, both isolates belonged to genotype 1 and to the previously defined "vampire bat-related RABV lineage", which consisted mainly of D. rotundus and A. lituratus isolates; however, a branch pattern with high bootstrap values suggested that BR-DR1 was more closely related to the 9001FRA isolate, which was collected from a dog bitten by a bat in French Guiana, than to BR-AL1. This result suggests that the vampire bat-related RABV lineage includes Brazilian vampire bat and Brazilian frugivorous bat RABV and is further divided into Brazilian vampire bat and Brazilian frugivorous bat RABV sub-lineages. The phylogenetic analysis based on the complete genomes was valuable in discriminating among very closely related isolates.
