934 results for Whole-Genome Association


Relevance:

100.00%

Publisher:

Abstract:

Topological measures of large-scale complex networks are applied to a specific artificial regulatory network model created through a whole genome duplication and divergence mechanism. This class of networks shares topological features with natural transcriptional regulatory networks. Specifically, these networks display scale-free and small-world topology and possess subgraph distributions similar to those of natural networks. Thus, the topologies inherent in natural networks may be due in part to their method of creation rather than being shaped exclusively by subsequent evolution under selection. The evolvability of the dynamics of these networks is also examined by evolving networks in simulation to obtain three simple types of output dynamics. The networks obtained from this process show a wide variety of topologies and numbers of genes, indicating that it is relatively easy to evolve these classes of dynamics in this model. (c) 2006 Elsevier Ireland Ltd. All rights reserved.
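
The small-world claim above rests on two standard topological measures: a high clustering coefficient combined with a short average path length. The sketch below shows how these two quantities can be computed. It assumes a small undirected toy graph (the `toy` adjacency dictionary and all function names are hypothetical) and is not the model or code used in the paper, whose regulatory networks are directed.

```python
from collections import deque


def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbours of `node` that are linked to each other."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def mean_clustering(adj):
    """Average of the per-node clustering coefficients."""
    return sum(clustering_coefficient(adj, n) for n in adj) / len(adj)


def mean_shortest_path(adj):
    """Average BFS distance over all ordered pairs of distinct, connected nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical toy graph: a triangle (a, b, c) with one extra node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

print(mean_clustering(toy), mean_shortest_path(toy))
```

A network is usually called small-world when its mean clustering is much higher than that of a random graph with the same number of nodes and edges while its mean path length stays comparable.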

Abstract:

Plant reproduction depends on the concerted activation of many genes to ensure correct communication between pollen and pistil. Here, we queried the whole transcriptome of Arabidopsis (Arabidopsis thaliana) in order to identify genes with specific reproductive functions. We used the Affymetrix ATH1 whole genome array to profile wild-type unpollinated pistils and unfertilized ovules. By comparing the expression profile of pistils at 0.5, 3.5, and 8.0 h after pollination and applying a number of statistical and bioinformatics criteria, we found 1,373 genes differentially regulated during pollen-pistil interactions. Robust clustering analysis grouped these genes in 16 time-course clusters representing distinct patterns of regulation. Coregulation within each cluster suggests the presence of distinct genetic pathways, which might be under the control of specific transcriptional regulators. A total of 78% of the regulated genes were expressed initially in unpollinated pistil and/or ovules, 15% were initially detected in the pollen data sets as enriched or preferentially expressed, and 7% were induced upon pollination. Among those, we found a particular enrichment for unknown transcripts predicted to encode secreted proteins or representing signaling and cell wall-related proteins, which may function by remodeling the extracellular matrix or as extracellular signaling molecules. A strict regulatory control in various metabolic pathways suggests that fine-tuning of the biochemical and physiological cellular environment is crucial for reproductive success. Our study provides a unique and detailed temporal and spatial gene expression profile of in vivo pollen-pistil interactions, providing a framework to better understand the basis of the molecular mechanisms operating during the reproductive process in higher plants.

Abstract:

Klebsiella pneumoniae U25 is a multidrug-resistant strain isolated from a tertiary care hospital in Chennai, India. Here, we report the complete annotated genome sequence of strain U25, obtained using PacBio RSII. This is the first report of a whole genome of K. pneumoniae from Chennai. The genome consists of a single circular chromosome of 5,491,870 bp and two plasmids of 211,813 bp and 172,619 bp. The genes associated with multidrug resistance were identified. The chromosome of U25 was found to carry eight antibiotic resistance genes [blaOXA-1, blaSHV-28, aac(6’)1b-cr, catB3, oqxAB, dfrA1]. The plasmid pMGRU25-001 was found to carry only one resistance gene (catA1), while plasmid pMGRU25-002 carried 20 resistance genes [strAB, aadA1, aac(6’)-Ib, aac(3)-IId, sul1,2, blaTEM-1A,1B, blaOXA-9, blaCTX-M-15, blaSHV-11, cmlA1, erm(B), mph(A)]. A mutation in the porin OmpK36 was identified, which is likely to be associated with the intermediate resistance to carbapenems in the absence of carbapenemase genes. U25 is one of the few K. pneumoniae strains to harbour clustered regularly interspaced short palindromic repeats (CRISPR) systems. Two CRISPR arrays corresponding to Cas3 family helicase were identified in the genome. In comparison with K. pneumoniae NTUHK2044, a transposase gene, InsH of IS5-13, was found to be inserted in U25.

Abstract:

Legionella is a Gram-negative bacterium that represents a public health issue with heavy social and economic impact. It is therefore mandatory to provide a proper environmental surveillance and risk assessment plan for Legionella control in the water distribution systems of hospital and community buildings. This thesis joins several methodologies in a single workflow for the identification of non-pneumophila Legionella species (n-pL), starting from standard methods such as culture and gene sequencing (mip and rpoB) and extending to innovative approaches such as the MALDI-TOF MS technique and whole genome sequencing (WGS). The results obtained were compared to identify the Legionella isolates and led to the identification of four presumptive novel Legionella species. One of these four new isolates was characterized and taxonomically recognized under the name Legionella bononiensis (the 64th Legionella species). The workflow applied in this thesis helps to increase knowledge of environmental Legionella species, improving the description of the environment itself and of the events that promote the growth of Legionella in its ecological niche. Correct identification and characterization of the isolates makes it possible to prevent their spread in man-made environments and to contain the occurrence of cases, clusters, or outbreaks. The experimental work undertaken could therefore support preventive measures during environmental and clinical surveillance, improving the study of species that are often underestimated or still unknown.

Abstract:

Background: WGS is increasingly used as a first-line diagnostic test for patients with rare genetic diseases such as neurodevelopmental disorders (NDD). Clinical applications require a robust infrastructure to support processing, storage and analysis of WGS data. The identification and interpretation of SVs from WGS data also need to be improved. Finally, there is a need for a prioritization system that enables downstream clinical analysis and facilitates data interpretation. Here, we present the results of a clinical application of WGS in a cohort of patients with NDD. Methods: We developed highly portable workflows for processing WGS data, including alignment, quality control, and variant calling of SNVs and SVs. A benchmark analysis of state-of-the-art SV detection tools was performed to select the most accurate combination for SV calling. A gene-based prioritization system was also implemented to support variant interpretation. Results: Using a benchmark analysis, we selected the most accurate combination of tools to improve SV detection from WGS data and built a dedicated pipeline. Our workflows were used to process WGS data from 77 NDD patient-parent families. The prioritization system supported downstream analysis and enabled a molecular diagnosis in 32% of patients, 25% of which were SVs, and suggested a potential diagnosis in a further 20% of patients, who require further investigation to achieve diagnostic certainty. Conclusion: Our data suggest that the integration of SNVs and SVs is a main factor that increases the diagnostic yield of WGS and show that the adoption of a dedicated pipeline improves the process of variant detection and interpretation.

Abstract:

The artisanal food chain is enriched by a wide diversity of local food productions with delightful organoleptic characteristics and valuable nutritional properties. Despite their increasing worldwide popularity and appeal, several food safety challenges must be addressed in artisanal facilities, which suffer from less standardized processing conditions. In this scenario, recent advances in molecular typing and genomic surveillance (e.g., Whole Genome Sequencing [WGS]) represent an unprecedented solution capable of inferring sources of contamination and contributing to food safety along the artisanal food continuum. The overall objective of this PhD thesis was to explore potential microbial hazards among different artisanal food productions of animal origin (dairy and meat-derived) typical of the food culture and heritage of Mediterranean countries. Three studies were carried out, specifically focusing on: 1) comparing the seasonal variability of microbiological quality and the potential occurrence of microbial hazards in two batches of Italian artisanal fermented dairy and meat productions; 2) investigating the genetic relationships, virulome and resistome of foodborne pathogens isolated from dairy and meat-derived productions located in Italy, Spain, Portugal and Morocco; 3) investigating the population structure, virulome, resistome and mobilome of Klebsiella spp. isolates collected from study 1, including an extended range of public sequences.

Abstract:

BACKGROUND: Human immunodeficiency virus (HIV) takes advantage of multiple host proteins to support its own replication. The gene ZNRD1 (zinc ribbon domain-containing 1) has been identified as encoding a potential host factor that influenced disease progression in HIV-positive individuals in a genomewide association study and also significantly affected HIV replication in a large-scale in vitro short interfering RNA (siRNA) screen. Genes and polymorphisms identified by large-scale analysis need to be followed up by means of functional assays and resequencing efforts to more precisely map causal genes. METHODS: Genotyping and ZNRD1 gene resequencing for 208 HIV-positive subjects (119 who experienced long-term nonprogression [LTNP] and 89 who experienced normal disease progression) was done by either TaqMan genotyping assays or direct sequencing. Genetic association analysis was performed with the SNPassoc package and Haploview software. siRNA and short hairpin RNA (shRNA) specifically targeting ZNRD1 were used to transiently or stably down-regulate ZNRD1 expression in both lymphoid and nonlymphoid cells. Cells were infected with X4 and R5 HIV strains, and efficiency of infection was assessed by reporter gene assay or p24 assay. RESULTS: Genetic association analysis found a strong statistically significant correlation with the LTNP phenotype (single-nucleotide polymorphism rs1048412; [Formula: see text]), independently of HLA-A10 influence. siRNA-based functional analysis showed that ZNRD1 down-regulation by siRNA or shRNA impaired HIV-1 replication at the transcription level in both lymphoid and nonlymphoid cells. CONCLUSION: Genetic association analysis unequivocally identified ZNRD1 as an independent marker of LTNP to AIDS. Moreover, in vitro experiments pointed to viral transcription as the inhibited step. Thus, our data strongly suggest that ZNRD1 is a host cellular factor that influences HIV-1 replication and disease progression in HIV-positive individuals.

Abstract:

Copy number variation (CNV) has recently gained considerable interest as a source of genetic variation likely to play a role in phenotypic diversity and evolution. Much effort has been put into the identification and mapping of regions that vary in copy number among seemingly normal individuals in humans and a number of model organisms, using bioinformatics or hybridization-based methods. These methods have uncovered associations between copy number changes and complex diseases in whole-genome association studies and have identified new genomic disorders. At the genome-wide scale, however, the functional impact of CNV remains poorly studied. Here we review the current catalogs of CNVs, their association with diseases and how they link genotype and phenotype. We describe initial evidence which revealed that genes in CNV regions are expressed at lower and more variable levels than genes mapping elsewhere, and that CNV not only affects the expression of genes varying in copy number but also has a global influence on the transcriptome. Further studies are warranted for complete cataloguing and fine mapping of CNVs, as well as to elucidate the different mechanisms by which they influence gene expression.

Abstract:

Large, rare copy number variants (CNVs) have been implicated in a variety of psychiatric disorders, but the role of CNVs in recurrent depression is unclear. We performed a genome-wide analysis of large, rare CNVs in 3106 cases of recurrent depression, 459 controls screened for lifetime absence of psychiatric disorder and 5619 unscreened controls from phase 2 of the Wellcome Trust Case Control Consortium (WTCCC2). We compared the frequency of cases with CNVs against the frequency observed in each control group, analysing CNVs over the whole genome and over genic, intergenic, intronic and exonic regions. We found that deletion CNVs were associated with recurrent depression, whereas duplications were not. The effect was significant when comparing cases with WTCCC2 controls (P=7.7 × 10^-6, odds ratio (OR)=1.25 (95% confidence interval (CI) 1.13-1.37)) and with screened controls (P=5.6 × 10^-4, OR=1.52 (95% CI 1.20-1.93)). Further analysis showed that CNVs deleting protein-coding regions were largely responsible for the association. Within an analysis of regions previously implicated in schizophrenia, we found an overall enrichment of CNVs in our cases when compared with screened controls (P=0.019). We observe an ordered increase of samples with deletion CNVs, with the lowest proportion seen in screened controls, the next highest in unscreened controls and the highest in cases. This may suggest that the absence of deletion CNVs, especially in genes, is associated with resilience to recurrent depression.
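
The odds ratios and 95% confidence intervals quoted above summarise a 2x2 table of carrier status against case/control status. The abstract does not give the underlying carrier counts or say how the intervals were obtained, so the following is only a sketch of one common approach (the Wald interval on the log odds ratio), using hypothetical counts and a hypothetical function name.

```python
import math


def odds_ratio_with_ci(carrier_cases, noncarrier_cases,
                       carrier_controls, noncarrier_controls, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    z=1.96 gives an approximate 95% interval. All four cells must be positive.
    """
    a, b = carrier_cases, noncarrier_cases
    c, d = carrier_controls, noncarrier_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only; they are NOT taken from the study.
estimate, lower, upper = odds_ratio_with_ci(20, 80, 10, 90)
print(f"OR={estimate:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval that excludes 1, as both intervals reported in the abstract do, corresponds to a statistically significant association at roughly the 5% level.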

Abstract:

In population studies, most current methods focus on identifying one outcome-related SNP at a time by testing for differences in genotype frequencies between disease and healthy groups or among different population groups. However, testing a great number of SNPs simultaneously raises a multiple-testing problem and will give false-positive results. Although this problem can be dealt with effectively through approaches such as Bonferroni correction, permutation testing and false discovery rates, patterns of the joint effects of several genes, each with a weak effect, might not be detectable. With the availability of high-throughput genotyping technology, searching for multiple scattered SNPs over the whole genome and modeling their joint effect on the target variable has become possible. Exhaustive search of all SNP subsets is computationally infeasible for millions of SNPs in a genome-wide study. Several effective feature selection methods combined with classification functions have been proposed to search for an optimal SNP subset in big data sets where the number of feature SNPs far exceeds the number of observations.
In this study, we take two steps to achieve this goal. First, we selected 1000 SNPs through an effective filter method, and then we performed a feature selection wrapped around a classifier to identify an optimal SNP subset for predicting disease. We also developed a novel classification method, the sequential information bottleneck method, wrapped inside different search algorithms to identify an optimal subset of SNPs for classifying the outcome variable. This new method was compared with classical linear discriminant analysis in terms of classification performance. Finally, we performed a chi-square test to look at the relationship between each SNP and disease from another point of view.
In general, our results show that filtering features using the harmonic mean of sensitivity and specificity (HMSS) through linear discriminant analysis (LDA) is better than using LDA training accuracy or mutual information in our study. Our results also demonstrate that exhaustive search of small subsets of one, two or three SNPs, based on the best 100 composite 2-SNPs, can find an optimal subset, and that further inclusion of more SNPs through a heuristic algorithm does not always increase the performance of SNP subsets. Although sequential forward floating selection can be applied to prevent the nesting effect of forward selection, it does not always outperform the latter, owing to overfitting from observing more complex subset states.
Our results also indicate that HMSS, as a criterion for evaluating the classification ability of a function, can be used on imbalanced data without modifying the original dataset, unlike classification accuracy. Our four studies suggest that the Sequential Information Bottleneck (sIB), a new unsupervised technique, can be adopted to predict the outcome, and that its ability to detect the target status is superior to that of traditional LDA in the study.
From our results we can see that the best test probability-HMSS for predicting CVD, stroke, CAD and psoriasis through sIB is 0.59406, 0.641815, 0.645315 and 0.678658, respectively. In terms of group prediction accuracy, the highest test accuracy of sIB for diagnosing a normal status among controls can reach 0.708999, 0.863216, 0.639918 and 0.850275, respectively, in the four studies if the test accuracy among cases is required to be not less than 0.4. On the other hand, the highest test accuracy of sIB for diagnosing a disease among cases can reach 0.748644, 0.789916, 0.705701 and 0.749436, respectively, in the four studies if the test accuracy among controls is required to be at least 0.4.
A further genome-wide association study through the chi-square test shows that no significant SNPs are detected at the cut-off level 9.09451E-08 in the Framingham heart study of CVD. Study results in WTCCC detect only two significant SNPs associated with CAD. In the genome-wide study of psoriasis, most of the top 20 SNP markers with impressive classification accuracy are also significantly associated with the disease through the chi-square test at the cut-off value 1.11E-07.
Although our classification methods can achieve high accuracy in the study, complete descriptions of those classification results (95% confidence interval or statistical test of differences) require more cost-effective methods or a more efficient computing system, neither of which is currently available in our genome-wide study. We should also note that the purpose of this study is to identify subsets of SNPs with high prediction ability, and that SNPs with good discriminant power are not necessarily causal markers for the disease.
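
The abstract's main evaluation criterion, the harmonic mean of sensitivity and specificity (HMSS), and the Bonferroni correction it mentions are both short formulas. The sketch below illustrates them with hypothetical confusion-matrix counts and function names. It is not the thesis code, and the number of tests behind the reported cut-off values is not given in the abstract, so the Bonferroni example uses an assumed count.

```python
def accuracy(tp, fn, tn, fp):
    """Overall fraction of correctly classified subjects."""
    return (tp + tn) / (tp + fn + tn + fp)


def hmss(tp, fn, tn, fp):
    """Harmonic mean of sensitivity and specificity.

    tp/fn count cases (true positives, false negatives);
    tn/fp count controls (true negatives, false positives).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    if sensitivity + specificity == 0:
        return 0.0
    return 2 * sensitivity * specificity / (sensitivity + specificity)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance cut-off that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Hypothetical imbalanced data set: 10 cases, 90 controls, and a classifier
# that labels every subject as a control.
always_control = dict(tp=0, fn=10, tn=90, fp=0)
print(accuracy(**always_control))  # high accuracy despite missing every case
print(hmss(**always_control))      # HMSS exposes the failure

# Hypothetical balanced example with sensitivity 0.8 and specificity 0.6.
print(hmss(tp=40, fn=10, tn=30, fp=20))

# Assumed number of SNPs tested (not stated in the abstract).
print(bonferroni_threshold(0.05, 500_000))
```

The first example shows why HMSS suits imbalanced data: a trivial classifier reaches 90% accuracy there but scores zero on HMSS, because the harmonic mean is dominated by the weaker of the two rates.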

Abstract:

To identify genetic susceptibility loci for severe diabetic retinopathy, 286 Mexican-Americans with type 2 diabetes from Starr County, Texas completed detailed physical and ophthalmologic examinations, including fundus photography for diabetic retinopathy grading. A total of 103 individuals with moderate-to-severe non-proliferative diabetic retinopathy or proliferative diabetic retinopathy were defined as cases for this study. DNA samples extracted from study subjects were genotyped using the Affymetrix GeneChip® Human Mapping 100K Set, which includes 116,204 single nucleotide polymorphisms (SNPs) across the whole genome. Single-marker allelic tests and 2- to 8-SNP sliding-window Haplotype Trend Regression implemented in HelixTree™ were first performed with these direct genotypes to identify genes/regions contributing to the risk of severe diabetic retinopathy. An additional 1,885,781 HapMap Phase II SNPs were imputed from the direct genotypes to expand the genomic coverage for a more detailed exploration of genetic susceptibility to diabetic retinopathy. The average estimated allelic dosage and imputed genotypes with the highest posterior probabilities were subsequently analyzed for associations using logistic regression and Fisher's Exact allelic tests, respectively. To move beyond these SNP-based approaches, 104,572 directly genotyped and 333,375 well-imputed SNPs were used to construct genetic distance matrices based on 262 retinopathy candidate genes and their 112 related biological pathways. Multivariate distance matrix regression was then used to test hypotheses with genes and pathways as the units of inference in the context of susceptibility to diabetic retinopathy. This study provides a framework for genome-wide association analyses and implicates several genes involved in the regulation of oxidative stress, inflammatory processes, histidine metabolism, and pancreatic cancer pathways as associated with severe diabetic retinopathy.
Many of these loci have not previously been implicated in either diabetic retinopathy or diabetes. In summary, CDC73, IL12RB2, and SULF1 had the best evidence as candidates to influence diabetic retinopathy, possibly through novel biological mechanisms related to the VEGF-mediated signaling pathway or inflammatory processes. While this study uncovered some genes for diabetic retinopathy, a comprehensive picture of the genetic architecture of diabetic retinopathy has not yet been achieved. Once fully understood, the genetics and biology of diabetic retinopathy will contribute to better strategies for diagnosis, treatment and prevention of this disease.
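
The allelic tests mentioned in this abstract compare allele counts, rather than genotype counts, between cases and controls. As a rough illustration, the sketch below converts genotype counts into a 2x2 allele table and applies a two-sided Fisher's exact test written with the standard library only. The genotype counts and function names are hypothetical, and the study's own software may differ in details such as how the two-sided p-value is defined.

```python
from math import comb


def allele_counts(n_AA, n_Aa, n_aa):
    """Turn genotype counts into (count of allele A, count of allele a)."""
    return 2 * n_AA + n_Aa, 2 * n_aa + n_Aa


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical genotype counts (AA, Aa, aa); NOT taken from the study.
case_A, case_a = allele_counts(10, 40, 50)
control_A, control_a = allele_counts(5, 30, 65)
p = fisher_exact_two_sided(case_A, case_a, control_A, control_a)
print(case_A, case_a, control_A, control_a, p)
```

Counting alleles doubles the effective sample size compared with counting individuals, which is why the allelic test implicitly assumes Hardy-Weinberg equilibrium in the combined sample.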

Abstract:

Human papillomavirus type 6 (HPV6) is the major etiological agent of anogenital warts and laryngeal papillomas and has been included in both the quadrivalent and nonavalent prophylactic HPV vaccines. This study investigated the global genomic diversity of HPV6, using 724 isolates and 190 complete genomes from six continents, and the association of HPV6 genomic variants with geographical location, anatomical site of infection/disease, and gender. Initially, a 2,800-bp E5a-E5b-L1-LCR fragment was sequenced from 492/530 (92.8%) HPV6-positive samples collected for this study. Among them, 130 exhibited at least one single nucleotide polymorphism (SNP), indel, or amino acid change in the E5a-E5b-L1-LCR fragment and were sequenced in full. A global alignment and maximum likelihood tree of 190 complete HPV6 genomes (130 fully sequenced in this study and 60 obtained from sequence repositories) revealed two variant lineages, A and B, and five B sublineages: B1, B2, B3, B4, and B5. HPV6 (sub)lineage-specific SNPs and a 960-bp representative region for whole-genome-based phylogenetic clustering within the L2 open reading frame were identified. Multivariate logistic regression analysis revealed that lineage B predominated globally. Sublineage B3 was more common in Africa and North and South America, and lineage A was more common in Asia. Sublineages B1 and B3 were associated with anogenital infections, indicating a potential lesion-specific predilection of some HPV6 sublineages. Females had higher odds for infection with sublineage B3 than males. In conclusion, a global HPV6 phylogenetic analysis revealed the existence of two variant lineages and five sublineages, showing some degree of ethnogeographic, gender, and/or disease predilection in their distribution. IMPORTANCE: This study established the largest database of globally circulating HPV6 genomic variants and contributed a total of 130 new, complete HPV6 genome sequences to available sequence repositories. 
Two HPV6 variant lineages and five sublineages were identified and showed some degree of association with geographical location, anatomical site of infection/disease, and/or gender. We additionally identified several HPV6 lineage- and sublineage-specific SNPs to facilitate the identification of HPV6 variants and determined a representative region within the L2 gene that is suitable for HPV6 whole-genome-based phylogenetic analysis. This study complements and significantly expands the current knowledge of HPV6 genetic diversity and forms a comprehensive basis for future epidemiological, evolutionary, functional, pathogenicity, vaccination, and molecular assay development studies.