846 results for Whole genome sequencing
Abstract:
A comprehensive second-generation whole genome radiation hybrid (RH II), cytogenetic and comparative map of the horse genome (2n = 64) has been developed using the 5000-rad horse × hamster radiation hybrid panel and fluorescence in situ hybridization (FISH). The map contains 4,103 markers (3,816 RH; 1,144 FISH) assigned to all 31 pairs of autosomes and the X chromosome. The RH maps of individual chromosomes are anchored and oriented using 857 cytogenetic markers. The overall resolution of the map is one marker per 775 kilobase pairs (kb), which represents a more than five-fold improvement over the first-generation map. The RH II incorporates 920 markers shared jointly with the two recently reported meiotic maps. Consequently, the two maps were aligned with the RH II maps of individual autosomes and the X chromosome. Additionally, a comparative map of the horse genome was generated by connecting 1,904 loci on the horse map with genome sequences available for eight diverse vertebrates to highlight regions of evolutionarily conserved syntenies, linkages, and chromosomal breakpoints. The integrated map thus obtained presents the most comprehensive information on the physical and comparative organization of the equine genome and will assist future assemblies of whole genome BAC fingerprint maps and the genome sequence. It will also serve as a tool to identify genes governing health, disease and performance traits in horses and assist us in understanding the evolution of the equine genome in relation to other species.
Abstract:
Here we discuss proteomic analyses of whole cell preparations of the mosquito stages of malaria parasite development (i.e. gametocytes, microgamete, ookinete, oocyst and sporozoite) of Plasmodium berghei. We also include critiques of the proteomes of two cell fractions from the purified ookinete, namely the micronemes and cell surface. Whereas we summarise key biological interpretations of the data, we also try to identify key methodological constraints we have met, only some of which we were able to resolve. Recognising the need to translate the potential of current genome sequencing into functional understanding, we report our efforts to develop more powerful combinations of methods for the in silico prediction of protein function and location. We have applied this analysis to the proteome of the male gamete, a cell whose very simple structural organisation facilitated interpretation of data. Some of the in silico predictions made have now been supported by ongoing protein tagging and genetic knockout studies. We hope this discussion may assist future studies.
Abstract:
A novel canine muscular dystrophy in Landseer dogs was observed. We had access to five affected dogs from two litters. The clinical signs started at a few weeks of age and the severe progressive muscle weakness led to euthanasia between 5 and 15 months of age. The pedigrees of the affected dogs suggested a monogenic autosomal recessive inheritance of the trait. Linkage and homozygosity mapping indicated two potential genome segments for the causative variant on chromosomes 10 and 31 harboring a total of 4.8 Mb of DNA or 0.2% of the canine genome. Using Illumina sequencing technology, we obtained a whole genome sequence from one affected Landseer. Variants were called with respect to the dog reference genome and compared to the genetic variants of 170 control dogs from other breeds. The affected Landseer dog was homozygous for a single private non-synonymous variant in the critical intervals, a nonsense variant in the COL6A1 gene (Chr31:39,303,964G>T; COL6A1:c.289G>T; p.E97*). Genotypes at this variant showed perfect concordance with the muscular dystrophy phenotype in all five cases and more than one thousand control dogs. Variants in the human COL6A1 gene cause Bethlem myopathy or Ullrich congenital muscular dystrophy. We therefore conclude that the identified canine COL6A1 variant is most likely causative for the observed muscular dystrophy in Landseer dogs. Based on the nature of the genetic variant in Landseer dogs and their severe clinical phenotype, these dogs represent a model for human Ullrich congenital muscular dystrophy.
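The filtering logic described in this abstract (keep variants inside the mapped critical intervals that are homozygous in the sequenced case, absent from all control genomes, and protein-changing) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: the `Variant` record, the effect labels, and every coordinate below are hypothetical toy values and do not correspond to the study's intervals or variant calls.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical record layout; a real pipeline would parse annotated VCF data.
@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    effect: str             # e.g. "nonsense", "missense", "synonymous"
    case_genotype: str      # "hom_alt", "het" or "hom_ref" in the sequenced case
    control_alt_count: int  # control genomes carrying the alternate allele

# Assumed set of effect labels treated as protein-changing (illustrative).
PROTEIN_CHANGING = {"nonsense", "missense", "frameshift", "splice_site"}

Interval = Tuple[str, int, int]  # (chromosome, start, end), inclusive


def in_intervals(v: Variant, intervals: List[Interval]) -> bool:
    """True if the variant lies inside any critical interval."""
    return any(c == v.chrom and start <= v.pos <= end
               for c, start, end in intervals)


def private_candidates(variants: List[Variant],
                       intervals: List[Interval]) -> List[Variant]:
    """Keep variants that pass all four filters described in the lead-in."""
    return [
        v for v in variants
        if in_intervals(v, intervals)
        and v.case_genotype == "hom_alt"      # recessive model: case is homozygous
        and v.control_alt_count == 0          # "private": never seen in controls
        and v.effect in PROTEIN_CHANGING
    ]


# Toy data with made-up coordinates, one record per filter outcome.
toy_intervals = [("chr10", 100, 200), ("chr31", 500, 900)]
toy_variants = [
    Variant("chr31", 650, "nonsense", "hom_alt", 0),    # passes every filter
    Variant("chr31", 700, "missense", "hom_alt", 12),   # present in controls
    Variant("chr10", 150, "synonymous", "hom_alt", 0),  # not protein-changing
    Variant("chr10", 180, "missense", "het", 0),        # case not homozygous
    Variant("chr5", 650, "nonsense", "hom_alt", 0),     # outside the intervals
]

candidates = private_candidates(toy_variants, toy_intervals)
for v in candidates:
    print(v.chrom, v.pos, v.effect)
```

Applying the control filter last or first makes no difference to the result; in practice the interval filter is usually applied first because it discards most of the genome cheaply.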
Abstract:
Clinical, pathological and genetic examination revealed an as yet uncharacterized juvenile-onset neuroaxonal dystrophy (NAD) in Spanish water dogs. Affected dogs presented with various neurological deficits including gait abnormalities and behavioral deficits. Histopathology demonstrated spheroid formation accentuated in the grey matter of the cerebral hemispheres, the cerebellum, the brain stem and in the sensory pathways of the spinal cord. Iron accumulation was absent. Ultrastructurally, spheroids contained predominantly closely packed vesicles with a double-layered membrane, which were characterized as autophagosomes using immunohistochemistry. The family history of the four affected dogs suggested an autosomal recessive inheritance. SNP genotyping showed a single genomic region of extended homozygosity of 4.5 Mb in the four cases on CFA 8. Linkage analysis revealed a maximal parametric LOD score of 2.5 at this region. By whole genome re-sequencing of one affected dog, a perfectly associated, single, non-synonymous coding variant in the canine tectonin beta-propeller repeat-containing protein 2 (TECPR2) gene affecting a highly conserved region was detected (c.4009C>T or p.R1337W). This canine NAD form displays etiologic parallels to an inherited TECPR2-associated type of human hereditary spastic paraparesis (HSP). In contrast to the canine NAD, the spinal cord lesions in most types of human HSP involve the sensory and the motor pathways. Furthermore, the canine NAD form reveals similarities to cases of human NAD defined by widespread spheroid formation without iron accumulation in the basal ganglia. Thus, TECPR2 should also be considered as a candidate gene for human NAD. Immunohistochemistry and the ultrastructural findings further support the assumption that TECPR2 regulates autophagosome accumulation in the autophagic pathways.
Consequently, this report provides the first genetic characterization of juvenile canine NAD, describes the histopathological features associated with the TECPR2 mutation and provides evidence to emphasize the association between failure of autophagy and neurodegeneration.
Abstract:
Alveolar echinococcosis, caused by the tapeworm Echinococcus multilocularis, is one of the most severe parasitic diseases in humans and represents one of the 17 neglected diseases prioritised by the World Health Organisation (WHO) in 2012. Considering the major medical and veterinary importance of this parasite, the phylogeny of the genus Echinococcus is of considerable importance; yet, despite numerous efforts with both mitochondrial and nuclear data, it has remained unresolved. The genus is clearly complex, and this is one of the reasons for the incomplete understanding of its taxonomy. Although taxonomic studies have recognised E. multilocularis as a separate entity from the Echinococcus granulosus complex and other members of the genus, it would be premature to draw firm conclusions about the taxonomy of the genus before the phylogeny of the whole genus is fully resolved. The recent sequencing of E. multilocularis and E. granulosus genomes opens new possibilities for performing in-depth phylogenetic analyses. In addition, whole genome data provide the possibility of inferring phylogenies based on a large number of functional genes, i.e. genes that trace the evolutionary history of adaptation in E. multilocularis and other members of the genus. Moreover, genomic data open new avenues for studying the molecular epidemiology of E. multilocularis: genotyping studies with larger panels of genetic markers allow the genetic diversity and spatial dynamics of parasites to be evaluated with greater precision. There is an urgent need for international coordination of genotyping of E. multilocularis isolates from animals and human patients. This could be fundamental for a better understanding of the transmission of alveolar echinococcosis and for designing efficient healthcare strategies.
Abstract:
In population studies, most current methods focus on identifying one outcome-related SNP at a time by testing for differences in genotype frequencies between disease and healthy groups or among different population groups. However, testing a great number of SNPs simultaneously raises a multiple-testing problem and can produce false-positive results. Although this problem can be handled effectively through approaches such as Bonferroni correction, permutation testing and false discovery rates, the joint effects of several genes, each with a weak effect, might not be detected. With the availability of high-throughput genotyping technology, searching for multiple scattered SNPs over the whole genome and modeling their joint effect on the target variable has become possible. Exhaustive search of all SNP subsets is computationally infeasible for millions of SNPs in a genome-wide study. Several effective feature selection methods combined with classification functions have been proposed to search for an optimal SNP subset in large data sets where the number of feature SNPs far exceeds the number of observations. In this study, we take two steps to achieve the goal. First, we selected 1000 SNPs through an effective filter method, and then we performed feature selection wrapped around a classifier to identify an optimal SNP subset for predicting disease. We also developed a novel classification method, the sequential information bottleneck method, wrapped inside different search algorithms to identify an optimal subset of SNPs for classifying the outcome variable. This new method was compared with classical linear discriminant analysis in terms of classification performance. Finally, we performed chi-square tests to examine the relationship between each SNP and disease from another point of view.
In general, our results show that filtering features using the harmonic mean of sensitivity and specificity (HMSS) through linear discriminant analysis (LDA) is better than using LDA training accuracy or mutual information in our study. Our results also demonstrate that an exhaustive search of small subsets (one, two or three SNPs) based on the best 100 composite 2-SNP subsets can find an optimal subset, and that further inclusion of more SNPs through a heuristic algorithm does not always increase the performance of SNP subsets. Although sequential forward floating selection can be applied to prevent the nesting effect of forward selection, it does not always outperform the latter, owing to overfitting from observing more complex subset states. Our results also indicate that HMSS, as a criterion for evaluating the classification ability of a function, can be used on imbalanced data without modifying the original dataset, unlike classification accuracy. Our four studies suggest that the Sequential Information Bottleneck (sIB), a new unsupervised technique, can be adopted to predict the outcome and that its ability to detect the target status is superior to that of traditional LDA in the study. From our results we can see that the best test probability-HMSS values for predicting CVD, stroke, CAD and psoriasis through sIB are 0.59406, 0.641815, 0.645315 and 0.678658, respectively. In terms of group prediction accuracy, the highest test accuracy of sIB for diagnosing a normal status among controls can reach 0.708999, 0.863216, 0.639918 and 0.850275, respectively, in the four studies if the test accuracy among cases is required to be not less than 0.4. On the other hand, the highest test accuracy of sIB for diagnosing a disease among cases can reach 0.748644, 0.789916, 0.705701 and 0.749436, respectively, in the four studies if the test accuracy among controls is required to be at least 0.4.
A further genome-wide association study through chi-square tests shows that no significant SNPs are detected at the cut-off level 9.09451E-08 in the Framingham heart study of CVD. Study results in WTCCC detect only two significant SNPs that are associated with CAD. In the genome-wide study of psoriasis, most of the top 20 SNP markers with impressive classification accuracy are also significantly associated with the disease through the chi-square test at the cut-off value 1.11E-07. Although our classification methods can achieve high accuracy in the study, complete descriptions of those classification results (95% confidence intervals or statistical tests of differences) require more cost-effective methods or a more efficient computing system, neither of which is currently available for our genome-wide study. We should also note that the purpose of this study is to identify subsets of SNPs with high prediction ability, and SNPs with good discriminant power are not necessarily causal markers for the disease.
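The abstract's point about HMSS and imbalanced data can be illustrated with a small sketch. It assumes the standard harmonic-mean definition, HMSS = 2 · sensitivity · specificity / (sensitivity + specificity); the function names and the toy counts are hypothetical and not taken from the study. A Bonferroni-style cut-off helper is included for comparison, but the abstract does not state how its own cut-off values were derived, and the number of tests used below is a made-up figure.

```python
def confusion_rates(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of cases correctly detected
    specificity = tn / (tn + fp)   # fraction of controls correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def hmss(sensitivity: float, specificity: float) -> float:
    """Harmonic mean of sensitivity and specificity (0.0 if both are zero)."""
    total = sensitivity + specificity
    if total == 0:
        return 0.0
    return 2 * sensitivity * specificity / total


def bonferroni_cutoff(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Toy imbalanced data: 10 cases, 90 controls, and a classifier that
# labels every sample as a control.
sens, spec, acc = confusion_rates(tp=0, fn=10, tn=90, fp=0)
print("accuracy:", acc)             # high, despite missing every case
print("HMSS:", hmss(sens, spec))    # zero, because sensitivity is zero

# Hypothetical number of tests, chosen only for illustration.
print("cut-off:", bonferroni_cutoff(0.05, 500_000))
```

Because the harmonic mean collapses towards the smaller of its two inputs, a classifier cannot score well on HMSS by favouring the majority class, which is why no resampling of the dataset is needed.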
Abstract:
To identify genetic susceptibility loci for severe diabetic retinopathy, 286 Mexican-Americans with type 2 diabetes from Starr County, Texas completed detailed physical and ophthalmologic examinations including fundus photography for diabetic retinopathy grading. 103 individuals with moderate-to-severe non-proliferative diabetic retinopathy or proliferative diabetic retinopathy were defined as cases for this study. DNA samples extracted from study subjects were genotyped using the Affymetrix GeneChip® Human Mapping 100K Set, which includes 116,204 single nucleotide polymorphisms (SNPs) across the whole genome. Single-marker allelic tests and 2- to 8-SNP sliding-window Haplotype Trend Regression implemented in HelixTree™ were first performed with these direct genotypes to identify genes/regions contributing to the risk of severe diabetic retinopathy. An additional 1,885,781 HapMap Phase II SNPs were imputed from the direct genotypes to expand the genomic coverage for a more detailed exploration of genetic susceptibility to diabetic retinopathy. The average estimated allelic dosage and imputed genotypes with the highest posterior probabilities were subsequently analyzed for associations using logistic regression and Fisher's Exact allelic tests, respectively. To move beyond these SNP-based approaches, 104,572 directly genotyped and 333,375 well-imputed SNPs were used to construct genetic distance matrices based on 262 retinopathy candidate genes and their 112 related biological pathways. Multivariate distance matrix regression was then used to test hypotheses with genes and pathways as the units of inference in the context of susceptibility to diabetic retinopathy. This study provides a framework for genome-wide association analyses, and implicates several genes involved in the regulation of oxidative stress, inflammatory processes, histidine metabolism, and pancreatic cancer pathways associated with severe diabetic retinopathy.
Many of these loci have not previously been implicated in either diabetic retinopathy or diabetes. In summary, CDC73, IL12RB2, and SULF1 had the best evidence as candidates to influence diabetic retinopathy, possibly through novel biological mechanisms related to VEGF-mediated signaling pathway or inflammatory processes. While this study uncovered some genes for diabetic retinopathy, a comprehensive picture of the genetic architecture of diabetic retinopathy has not yet been achieved. Once fully understood, the genetics and biology of diabetic retinopathy will contribute to better strategies for diagnosis, treatment and prevention of this disease.^
Abstract:
Whole-genome duplication approximately 10^8 years ago was proposed as an explanation for the many duplicated chromosomal regions in Saccharomyces cerevisiae. Here we have used computer simulations and analytic methods to estimate some parameters describing the evolution of the yeast genome after this duplication event. Computer simulation of a model in which 8% of the original genes were retained in duplicate after genome duplication, and 70–100 reciprocal translocations occurred between chromosomes, produced arrangements of duplicated chromosomal regions very similar to the map of real duplications in yeast. An analytical method produced an independent estimate of 84 map disruptions. These results imply that many smaller duplicated chromosomal regions exist in the yeast genome in addition to the 55 originally reported. We also examined the possibility of determining the original order of chromosomal blocks in the ancestral unduplicated genome, but this cannot be done without information from one or more additional species. If the genome sequence of one other species (such as Kluyveromyces lactis) were known it should be possible to identify 150–200 paired regions covering the whole yeast genome and to reconstruct approximately two-thirds of the original order of blocks of genes in yeast. Rates of interchromosome translocation in yeast and mammals appear similar despite their very different rates of homologous recombination per kilobase.
Abstract:
We have developed high-density DNA microarrays of yeast ORFs. These microarrays can monitor hybridization to ORFs for applications such as quantitative differential gene expression analysis and screening for sequence polymorphisms. Automated scripts retrieved sequence information from public databases to locate predicted ORFs and select appropriate primers for amplification. The primers were used to amplify yeast ORFs in 96-well plates, and the resulting products were arrayed using an automated microarraying device. Arrays containing up to 2,479 yeast ORFs were printed on a single slide. The hybridization of fluorescently labeled samples to the array was detected and quantitated with a laser confocal scanning microscope. Applications of the microarrays are shown for genetic and gene expression analysis at the whole genome level.
Abstract:
The Plasmodium falciparum Genome Database (http://PlasmoDB.org) integrates sequence information, automated analyses and annotation data emerging from the P. falciparum genome sequencing consortium. To date, raw sequence coverage is available for >90% of the genome, and two chromosomes have been finished and annotated. Data in PlasmoDB are organized by chromosome (1–14), and can be accessed using a variety of tools for graphical and text-based browsing or downloaded in various file formats. The GUS (Genomics Unified Schema) implementation of PlasmoDB provides a multi-species genomic relational database, incorporating data from human and mouse, as well as P. falciparum. The relational schema uses a highly structured format to accommodate diverse data sets related to genomic sequence and gene expression. Tools have been designed to facilitate complex biological queries, including many that are specific to Plasmodium parasites and malaria as a disease. Additional projects seek to integrate genomic information with the rich data sets now becoming available for RNA transcription, protein expression, metabolic pathways, genetic and physical mapping, antigenic and population diversity, and phylogenetic relationships with other apicomplexan parasites. The overall goal of PlasmoDB is to facilitate Internet- and CD-ROM-based access to both finished and unfinished sequence information by the global malaria research community.
Abstract:
One challenge presented by large-scale genome sequencing efforts is effective display of uniform information to the scientific community. The Comprehensive Microbial Resource (CMR) contains robust annotation of all complete microbial genomes and allows for a wide variety of data retrievals. The bacterial information has been placed on the Web at http://www.tigr.org/CMR for retrieval using standard web browsing technology. Retrievals can be based on protein properties such as molecular weight or hydrophobicity, GC-content, functional role assignments and taxonomy. The CMR also has special web-based tools to allow data mining using pre-run homology searches, whole genome dot-plots, batch downloading and traversal across genomes using a variety of datatypes.
Abstract:
Previously conducted sequence analysis of Arabidopsis thaliana (ecotype Columbia-0) reported an insertion of 270-kb mtDNA into the pericentric region on the short arm of chromosome 2. DNA fiber-based fluorescence in situ hybridization analyses reveal that the mtDNA insert is 618 ± 42 kb, ≈2.3 times greater than that determined by contig assembly and sequencing analysis. Portions of the mitochondrial genome previously believed to be absent were identified within the insert. Sections of the mtDNA are repeated throughout the insert. The cytological data illustrate that DNA contig assembly by using bacterial artificial chromosomes tends to produce a minimal clone path by skipping over duplicated regions, thereby resulting in sequencing errors. We demonstrate that fiber-fluorescence in situ hybridization is a powerful technique to analyze large repetitive regions in the higher eukaryotic genomes and is a valuable complement to ongoing large genome sequencing projects.
Abstract:
Since 1991, the Rice Genome Research Program in Japan has carried out rice genomics, such as large-scale cDNA analysis, construction of a fine-scale restriction fragment length polymorphism map, and physical mapping of the rice genome with yeast artificial chromosome clones. These studies have made a great impact on research into grass genomes and made rice a model plant for other cereal crop research. Starting in 1998, the Rice Genome Research Program will step into a new stage of genomics—that of genome sequencing. This project eventually should reveal all of the genomic sequence information in the rice plant and be an indispensable aid in understanding the genomics of other grass species.
Abstract:
High resolution gene maps of the six chromosomes of Dictyostelium discoideum have been generated by a combination of physical mapping techniques. A set of yeast artificial chromosome clones has been ordered into overlapping arrays that cover >98% of the 34-megabase pair genome. Clones were grouped and ordered according to the genes they carried, as determined by hybridization analyses with DNA fragments from several hundred genes. Congruence of the gene order within each arrangement of clones with the gene order determined from whole genome restriction site mapping indicates that a high degree of confidence can be placed on the clone map. This clone-based description of the Dictyostelium chromosomes should be useful for the physical mapping and subcloning of new genes and should facilitate more detailed analyses of this genome.
Abstract:
Most cases of central precocious puberty (CPP) in girls remain idiopathic. The hypothesis of a genetic cause has gained strength since the discovery of some genes associated with this phenotype, especially those involved in the kisspeptin system (KISS1 and KISS1R). However, only isolated cases of CPP have been linked to mutations in kisspeptin or its receptor. Until recently, most genetic studies in CPP searched for candidate genes selected on the basis of animal models, genetic analysis of patients with hypogonadotropic hypogonadism, or genome-wide association studies. In this work, whole-exome sequencing, a more modern sequencing methodology, was used to identify variants associated with the CPP phenotype. Thirty-six individuals with the familial form of CPP (19 families) and 213 apparently sporadic cases were initially selected. The familial form was defined by the presence of more than one affected member in the family. Genomic DNA was extracted from the peripheral blood leukocytes of all patients. Whole-exome sequencing, performed with Illumina technology in 40 members of 15 families with CPP, identified inactivating mutations in a single gene, MKRN3, in five of these families. Mutation screening of MKRN3 by direct sequencing in two additional families (four patients) identified two new variants in this gene. MKRN3 is a single-exon gene located on chromosome 15 in a critical region for Prader-Willi syndrome. The MKRN3 gene is maternally imprinted and is expressed only from the paternal allele. The discovery of mutations in patients with familial CPP prompted a search for mutations in this gene in 213 patients with apparently sporadic CPP by means of polymerase chain reaction followed by enzymatic purification and direct automated (Sanger) sequencing.
Three new mutations and two previously identified ones, including four frameshifts and one missense variant, were found in the heterozygous state in six unrelated girls. All newly identified variants were absent from the databases (1000 Genomes and Exome Variant Server). Familial segregation analysis in three of these girls with apparently sporadic CPP and an MKRN3 mutation confirmed an autosomal dominant inheritance pattern with complete penetrance and exclusive transmission through the paternal allele, demonstrating that these cases were in fact also familial. Most of the mutations found in MKRN3 were of the frameshift or nonsense type, leading to premature stop codons and truncated proteins and thus confirming the association with the phenotype. The two missense mutations identified (p.Arg365Ser and p.Phe417Ile) were located in zinc finger or RING finger regions, which are important for protein function. In addition, in silico studies of these two variants indicated pathogenicity. All patients with an MKRN3 mutation showed clinical and hormonal features typical of premature activation of the reproductive axis. The median age at onset of puberty was 6 years in girls (range 3 to 6.5) and 8 years in boys (range 5.9 to 8.5). In view of the imprinting phenomenon, methylation analysis was also performed in a subgroup of 52 patients with CPP using the MS-MLPA technique, but no changes in the methylation pattern were found. In conclusion, this work identified a new gene associated with the CPP phenotype. Currently, inactivating mutations in MKRN3 represent the most common genetic cause of familial CPP (33%). MKRN3 is the first imprinted gene associated with pubertal disorders in humans. The precise mechanism of action of this gene in the regulation of GnRH secretion requires further study.