896 results for Whole genome mapping


Relevance:

80.00%

Publisher:

Abstract:

The South African Boer goat displays a characteristic white spotting phenotype, in which the pigment is limited to the head. Exploiting the existing phenotype variation within the breed, we mapped the locus causing this white spotting phenotype to chromosome 17 by genome-wide association. Subsequent whole genome sequencing identified a 1 Mb copy number variant (CNV) harboring 5 genes including EDNRA. The analysis of 358 Boer goats revealed 3 alleles with one, two, and three copies of this CNV. The copy number is correlated with the degree of white spotting in goats. We propose a hypothesis that ectopic overexpression of a mutant EDNRA scavenges EDN3 required for EDNRB signaling and normal melanocyte development and thus likely leads to an absence of melanocytes in the non-pigmented body areas of Boer goats. Our findings demonstrate the value of domestic animals as a reservoir of unique mutants and for identifying a precisely defined functional CNV.

Relevance:

80.00%

Publisher:

Abstract:

The present data set provides an Excel file in a zip archive. The file lists 334 samples of size-fractionated eukaryotic plankton communities with a suite of associated metadata (Database W1). Most samples represent the piconano- (0.8-5 µm, 73 samples), nano- (5-20 µm, 74 samples), micro- (20-180 µm, 70 samples), and meso- (180-2000 µm, 76 samples) planktonic size fractions, but some represent other organismal size fractions: 0.2-3 µm (1 sample), 0.8-20 µm (6 samples), 0.8 µm - infinity (33 samples), and 3-20 µm (1 sample). The table contains the following fields: a unique sample sequence identifier; the sampling station identifier; the Tara Oceans sample identifier (TARA_xxxxxxxxxx); an INSDC accession number for retrieving raw sequence data from the major nucleotide databases (short read archives at EBI, NCBI or DDBJ); the depth of sampling (Subsurface - SUR or Deep Chlorophyll Maximum - DCM); the targeted size range; the sequence template (either DNA or WGA/DNA if DNA extracted from the filters was Whole Genome Amplified); the latitude of the sampling event (decimal degrees); the longitude of the sampling event (decimal degrees); the time and date of the sampling event; the device used to collect the sample; the logsheet event corresponding to the sampling event; the volume of water sampled (liters). Information then follows on the cleaning bioinformatics pipeline shown in Figure W2 of the supplementary literature publication: the number of merged pairs present in the raw sequence file; the number of those sequences matching both primers; the number of sequences after quality-check filtering; the number of sequences after chimera removal; and finally the number of sequences after selecting only barcodes present in at least three copies in total and in at least two samples.
Finally, the following are given for each sequence sample: the number of distinct sequences (metabarcodes); the number of OTUs; the average number of barcodes per OTU; the Shannon diversity index based on barcodes for each sample (URL of W4 dataset in PANGAEA); and the Shannon diversity index based on each OTU (URL of W5 dataset in PANGAEA).
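
For readers unfamiliar with the Shannon diversity index mentioned above, the following is a minimal sketch of how it is usually computed from abundance counts. The abstract does not state the logarithm base or any normalisation, so the natural logarithm is an assumption here, and the counts are hypothetical values, not taken from the data set.

```python
import math

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over categories with non-zero counts.

    `counts` holds the abundance (e.g. reads) of each barcode or OTU in one sample.
    The natural logarithm is an assumption; the data set may use another base.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one non-zero count is required")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical read counts per OTU in a single sample.
otu_counts = [50, 30, 15, 5]
print(f"H' = {shannon_index(otu_counts):.3f}")
```

With perfectly even abundances the index equals the logarithm of the number of categories, and it drops to zero when a single category holds all reads.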

Relevance:

80.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-06

Relevance:

80.00%

Publisher:

Abstract:

The planctomycetes are a phylum of bacteria with a unique cell compartmentalisation, yeast-like budding cell division, and peptidoglycan-less proteinaceous cell walls. We wished to further our understanding of these unique organisms at the molecular level by searching for conserved amino acid sequence motifs and domains in the proteins encoded by Rhodopirellula baltica. Using BLAST and single-linkage clustering, we have discovered several new protein domains and sequence motifs in this planctomycete. R. baltica has multiple members of the newly discovered GEFGR protein family and the ASPIC C-terminal domain family, whilst most other organisms for which whole genome sequence is available have no more than one. Many of the domains and motifs appear to be restricted to the planctomycetes. It is possible that these protein domains and motifs may have been lost or replaced in other phyla, or they may have undergone multiple duplication events in the planctomycete lineage. One of the novel motifs probably represents a novel N-terminal export signal peptide. With their unique cell biology, it may be that the planctomycete cell compartmentalisation plan in particular needs special membrane transport mechanisms. The discovery of these new domains and motifs, many of which are associated with secretion and cell-surface functions, will help to stimulate experimental work and thus enhance further understanding of this fascinating group of organisms. © 2004 Federation of European Microbiological Societies. Published by Elsevier B.V. All rights reserved.

Relevance:

80.00%

Publisher:

Abstract:

Columnar cell lesions (CCLs) of the breast are a spectrum of lesions that have posed difficulties to pathologists for many years, prompting discussion concerning their biologic and clinical significance. We present a study of CCL in context with hyperplasia of usual type (HUT) and the more advanced lesions ductal carcinoma in situ (DCIS) and invasive ductal carcinoma. A total of 81 lesions from 18 patients were subjected to a comprehensive morphologic review based upon a modified version of Schnitt's classification system for CCL, immunophenotypic analysis (estrogen receptor [ER], progesterone receptor [PgR], Her2/neu, cytokeratin 5/6 [CK5/6], cytokeratin 14 [CK14], E-cadherin, p53) and, for the first time, a whole genome molecular analysis by comparative genomic hybridization. Multiple CCLs from 3 patients were studied in particular detail, with topographic information and/or showing a morphologic spectrum of CCL within individual terminal duct lobular units. CCLs were ER and PgR positive and CK5/6 and CK14 negative, and exhibited low numbers of genetic alterations and recurrent 16q loss, features that are similar to those of low grade in situ and invasive carcinoma. The molecular genetic profiles closely reflect the degree of proliferation and atypia in CCL, indicating some of these lesions represent both a morphologic and molecular continuum. In addition, overlapping chromosomal alterations between CCL and more advanced lesions within individual terminal duct lobular units suggest a commonality in molecular evolution. These data further support the hypothesis that CCLs are a nonobligate, intermediary step in the development of some forms of low grade in situ and invasive carcinoma. Copyright: © 2005 Lippincott Williams & Wilkins, Inc.

Relevance:

80.00%

Publisher:

Abstract:

Chlamydia pneumoniae is an obligate intracellular respiratory pathogen that causes 10% of community-acquired pneumonia and has been associated with cardiovascular disease. Both whole-genome sequencing and specific gene typing suggest that there is relatively little genetic variation in human isolates of C. pneumoniae. To date, there has been little genomic analysis of strains from human cardiovascular sites. The genotypes of C. pneumoniae present in human atherosclerotic carotid plaque were analysed and several polymorphisms in the variable domain 4 (VD4) region of the outer-membrane protein-A (ompA) gene and the intergenic region between the ygeD and uridine kinase (ygeD-urk) genes were found. While one genotype was identified that was the same as one reported previously in humans (respiratory and cardiovascular), another genotype was found that was identical to a genotype from non-human sources (frog/koala).

Relevance:

80.00%

Publisher:

Abstract:

In Mesoamerica, tropical dry forest is a highly threatened habitat, and species endemic to this environment are under extreme pressure. The tree species Lonchocarpus costaricensis is endemic to the dry northwest of Costa Rica and southwest Nicaragua. It is a locally important species but, as land has been cleared for agriculture, populations have experienced considerable reduction and fragmentation. To assess current levels and distribution of genetic diversity in the species, a combination of chloroplast-specific (cpDNA) and whole genome DNA markers (amplified fragment length polymorphism, AFLP) was used to fingerprint 121 individual trees in 6 populations. Two cpDNA haplotypes were identified, distributed among populations such that populations at the extremes of the distribution showed lowest diversity. A large number (487) of AFLP markers were obtained and indicated that diversity levels were highest in the two coastal populations (Cobano, Matapalo, H = 0.23, 0.28 respectively). Population differentiation was low overall, F-ST = 0.12, although Matapalo was strongly differentiated from all other populations (F-ST = 0.16-0.22), apart from Cobano (F-ST = 0.11). Spatial genetic structure was present in both datasets at different scales: cpDNA was structured at a range-wide distribution scale, whilst AFLP data revealed genetic neighbourhoods on a population scale. In general, the habitat degradation of recent times appears not to have yet impacted diversity levels in mature populations. However, although no data on seed or saplings were collected, it seems likely that reproductive mechanisms in the species will have been affected by land clearance. It is recommended that efforts should be made to conserve the extant genetic resource base and further research undertaken to investigate diversity levels in the progeny generation.

Relevance:

80.00%

Publisher:

Relevance:

80.00%

Publisher:

Abstract:

Background: Current methods to find significantly under- and over-represented gene ontology (GO) terms in a set of genes consider the genes as equally probable balls in a bag, as may be appropriate for transcripts in micro-array data. However, due to the varying length of genes and intergenic regions, that approach is inappropriate for deciding if any GO terms are correlated with a set of genomic positions. Results: We present an algorithm - GONOME - that can determine which GO terms are significantly associated with a set of genomic positions given a genome annotated with (at least) the starts and ends of genes. We show that certain GO terms may appear to be significantly associated with a set of randomly chosen positions in the human genome if gene lengths are not considered, and that these same terms have been reported as significantly over-represented in a number of recent papers. This apparent over-representation disappears when gene lengths are considered, as GONOME does. For example, we show that, when gene length is taken into account, the term 'development' is not significantly enriched in genes associated with human CpG islands, in contradiction to a previous report. We further demonstrate the efficacy of GONOME by showing that occurrences of the proteasome-associated control element (PACE) upstream activating sequence in the S. cerevisiae genome associate significantly with appropriate GO terms. An extension of this approach yields a whole-genome motif discovery algorithm that allows identification of many other promoter sequences linked to different types of genes, including a large group of previously unknown motifs significantly associated with the terms 'translation' and 'translational elongation'. Conclusion: GONOME is an algorithm that correctly extracts over-represented GO terms from a set of genomic positions. By explicitly considering gene size, GONOME avoids a systematic bias toward GO terms linked to large genes.
Inappropriate use of existing algorithms that do not take gene size into account has led to erroneous or suspect conclusions. Reciprocally, GONOME may be used to identify new features in genomes that are significantly associated with particular categories of genes.
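
The length bias described in this abstract can be illustrated with a small sketch. This is not the GONOME implementation, whose details are not given here; it only contrasts a count-based null probability ("balls in a bag") with a length-based one, using a binomial upper-tail test. The gene records, the hit counts, and the simplification of ignoring intergenic regions are all assumptions made for illustration.

```python
from math import comb

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def null_probabilities(genes, term):
    """Return (count-based, length-based) chance that a random genic hit carries `term`."""
    with_term = [g for g in genes if term in g["terms"]]
    p_count = len(with_term) / len(genes)
    p_length = sum(g["length"] for g in with_term) / sum(g["length"] for g in genes)
    return p_count, p_length

# Hypothetical toy genome: one long gene and three short ones.
genes = [
    {"length": 9000, "terms": {"development"}},
    {"length": 1000, "terms": {"translation"}},
    {"length": 1000, "terms": {"translation"}},
    {"length": 1000, "terms": {"translation"}},
]

p_count, p_length = null_probabilities(genes, "development")

# Suppose 7 of 10 genomic positions fall in the 'development' gene.
hits_with_term, total_hits = 7, 10
p_naive = binom_upper_tail(hits_with_term, total_hits, p_count)
p_length_aware = binom_upper_tail(hits_with_term, total_hits, p_length)
print(f"count-based null:  p = {p_naive:.4f}")
print(f"length-based null: p = {p_length_aware:.4f}")
```

In this toy case the count-based null makes the term look strongly enriched, while the length-based null shows that seven hits out of ten is about what a gene covering three quarters of the genic sequence would collect by chance.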

Relevance:

80.00%

Publisher:

Abstract:

Thirty-three microsatellite loci were isolated for the Australian rainforest tree Macadamia integrifolia. Genotyping across a test panel of 43 commercial cultivars generated an average polymorphic information content of 0.480. Five loci showed no polymorphism across cultivars. Significant linkage disequilibrium was detected in 10 pairwise comparisons, including two pairs of loci identified from the same clone sequence. The 33 microsatellite loci represent a significant tool for genome mapping and population genetic studies.
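
The abstract does not say how polymorphic information content (PIC) was calculated. A commonly used definition is PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i are the allele frequencies at a locus; the sketch below assumes that definition, and the allele frequencies are hypothetical.

```python
def polymorphic_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2 for one locus.

    `freqs` are allele frequencies that must sum to 1. Whether the study
    used exactly this definition is an assumption.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    squares = [p * p for p in freqs]
    cross = sum(
        2 * squares[i] * squares[j]
        for i in range(len(squares))
        for j in range(i + 1, len(squares))
    )
    return 1.0 - sum(squares) - cross

# Hypothetical allele frequencies at one microsatellite locus.
print(round(polymorphic_information_content([0.6, 0.3, 0.1]), 3))
```

A monomorphic locus, such as the five loci reported as showing no polymorphism, gives a PIC of zero under this definition, which pulls the panel average down.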

Relevance:

80.00%

Publisher:

Abstract:

Background: Determination of the subcellular location of a protein is essential to understanding its biochemical function. This information can provide insight into the function of hypothetical or novel proteins. These data are difficult to obtain experimentally but have become especially important since many whole genome sequencing projects have been finished and many resulting protein sequences are still lacking detailed functional information. In order to address this paucity of data, many computational prediction methods have been developed. However, these methods have varying levels of accuracy and perform differently based on the sequences that are presented to the underlying algorithm. It is therefore useful to compare these methods and monitor their performance. Results: In order to perform a comprehensive survey of prediction methods, we selected only methods that accepted large batches of protein sequences, were publicly available, and were able to predict localization to at least nine of the major subcellular locations (nucleus, cytosol, mitochondrion, extracellular region, plasma membrane, Golgi apparatus, endoplasmic reticulum (ER), peroxisome, and lysosome). The selected methods were CELLO, MultiLoc, Proteome Analyst, pTarget and WoLF PSORT. These methods were evaluated using 3763 mouse proteins from SwissProt that represent the source of the training sets used in development of the individual methods. In addition, an independent evaluation set of 2145 mouse proteins from LOCATE with a bias towards the subcellular localization underrepresented in SwissProt was used. The sensitivity and specificity were calculated for each method and compared to a theoretical value based on what might be observed by random chance. Conclusion: No individual method had a sufficient level of sensitivity across both evaluation sets that would enable reliable application to hypothetical proteins. 
All methods showed lower performance on the LOCATE dataset and variable performance on individual subcellular localizations was observed. Proteins localized to the secretory pathway were the most difficult to predict, while nuclear and extracellular proteins were predicted with the highest sensitivity.
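
As a rough illustration of the per-location evaluation described above, the sketch below computes sensitivity and specificity for one subcellular location by treating it as the positive class and all other locations as negative. The function name, the one-label-per-protein assumption, and the example labels are hypothetical; the study's exact scoring procedure is not given in the abstract.

```python
def sensitivity_specificity(true_labels, predicted_labels, location):
    """Return (sensitivity, specificity) for `location` in a one-vs-rest sense.

    Either value is None when its denominator is zero.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    tp = fp = tn = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == location:
            if predicted == location:
                tp += 1
            else:
                fn += 1
        elif predicted == location:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity

# Hypothetical annotations and predictions for six proteins.
truth = ["nucleus", "nucleus", "cytosol", "ER", "nucleus", "cytosol"]
predicted = ["nucleus", "cytosol", "cytosol", "nucleus", "nucleus", "cytosol"]
sens, spec = sensitivity_specificity(truth, predicted, "nucleus")
print(f"nucleus: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Repeating the call for each location and each prediction method would give the kind of per-location comparison the abstract reports.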

Relevance:

80.00%

Publisher:

Abstract:

The vacuolar H(+)-ATPase (V-ATPase), a multisubunit, adenosine triphosphate (ATP)-driven proton pump, is essential for numerous cellular processes in all eukaryotes investigated so far. While its structure and catalytic mechanism are similar to those of the evolutionarily related F-type ATPases, the V-ATPase's main function is to establish an electrochemical proton potential across membranes using ATP hydrolysis. The holoenzyme is formed by two subcomplexes, the transmembraneous V(0) and the cytoplasmic V(1) complexes. Sequencing of the whole genome of the ciliate Paramecium tetraurelia enabled the identification of virtually all the genes encoding V-ATPase subunits in this organism and the study of the enzyme's localization and its roles in membrane trafficking and osmoregulation. Surprisingly, the number of V-ATPase genes in this free-living protozoan is strikingly higher than in any other species previously studied. Especially abundant are V(0)-a-subunits, with as many as 17 encoding genes. This abundance creates the possibility of forming a large number of different V-ATPase holoenzymes by combination and has functional consequences by differential targeting to various organelles.

Relevance:

80.00%

Publisher:

Abstract:

Representational difference analysis (RDA) has great potential for preferential amplification of unique but uncharacterised DNA sequences present in one source such as a whole genome, but absent from a related genome or other complex population of sequences. While a few examples of its successful exploitation have been published, the method has not been well dissected and robust, detailed published protocols are lacking. Here we examine the method in detail, suggest improvements and provide a protocol that has yielded key unique sequences from a pathogenic bacterial genome. © 2003 Elsevier Science B.V. All rights reserved.

Relevance:

80.00%

Publisher:

Abstract:

Protein coding genes are comprised of protein-coding exons and non-protein-coding introns. The process of splicing involves removal of the introns and joining of the exons to form a mature messenger RNA, which subsequently undergoes translation into polypeptide. The spliceosome is a large, RNA/protein assembly of five small nuclear RNAs as well as over 300 proteins, which catalyzes intron removal and exon ligation. The selection of specific exons for inclusion in the mature messenger RNA is spatiotemporally regulated and results in production of an enormous diversity of polypeptides from a single gene locus. This phenomenon, known as alternative splicing, is regulated, in part, by protein splicing factors, which target the spliceosome to exon/intron boundaries. The first part of my dissertation (Chapters II and III) focuses on the discovery and characterization of the 45 kilodalton FK506 binding protein (FKBP45), which I discovered in the silk moth, Bombyx mori, as a U1 small nuclear RNA binding protein. This protein family binds the immunosuppressants FK506 and rapamycin and contains peptidyl-prolyl cis-trans isomerase activity, which converts polypeptides from cis to trans about a proline residue. This is the first time that an FKBP has been identified in the spliceosome. The second section of my dissertation (Chapters IV, V, VI and VII) is an investigation of the potential role of small nuclear RNA sequence variants in the control of splicing. I identified 46 copies of small nuclear RNAs in the 6X whole genome shotgun of the Bombyx mori p50T strain. These variants may play a role in differential binding of specific proteins that mediate alternative splicing. Along these lines, further investigation of U2 snRNA sequence variants in Bombyx mori demonstrated that some U2 snRNAs preferentially assemble into high molecular weight spliceosomal complexes over others. 
Expression of snRNA variants may represent another mechanism by which the cell is able to fine tune the splicing process.

Relevance:

80.00%

Publisher:

Abstract:

One of the hallmarks of bacterial survival is their ability to adapt rapidly to changing environmental conditions. Niche adaptation is a response to the signals received that are relayed, often to regulators that modulate gene expression. In the post-genomic era, DNA microarrays are used to study the dynamics of gene expression on a global scale. Numerous studies have used Pseudomonas aeruginosa, a Gram-negative environmental and opportunistic human pathogenic bacterium, as the model organism in whole-genome transcriptome analysis. This paper reviews the transcriptome studies that have led to immense advances in our understanding of the biology of this intractable human pathogen. Comparative analysis of 23 P. aeruginosa transcriptome studies has led to the identification of a unique set of genes that are signal specific and a core set that is differentially regulated. The 303 genes in the core set are involved in bacterial homeostasis, making them attractive therapeutic targets.