956 results for sequence variations
Genetic Variations and Diseases in UniProtKB/Swiss-Prot: The Ins and Outs of Expert Manual Curation.
Abstract:
During the last few years, next-generation sequencing (NGS) technologies have accelerated the detection of genetic variants resulting in the rapid discovery of new disease-associated genes. However, the wealth of variation data made available by NGS alone is not sufficient to understand the mechanisms underlying disease pathogenesis and manifestation. Multidisciplinary approaches combining sequence and clinical data with prior biological knowledge are needed to unravel the role of genetic variants in human health and disease. In this context, it is crucial that these data are linked, organized, and made readily available through reliable online resources. The Swiss-Prot section of the Universal Protein Knowledgebase (UniProtKB/Swiss-Prot) provides the scientific community with a collection of information on protein functions, interactions, biological pathways, as well as human genetic diseases and variants, all manually reviewed by experts. In this article, we present an overview of the information content of UniProtKB/Swiss-Prot to show how this knowledgebase can support researchers in the elucidation of the mechanisms leading from a molecular defect to a disease phenotype.
Abstract:
The ATP-binding cassette (ABC) family of proteins comprise a group of membrane transporters involved in the transport of a wide variety of compounds, such as xenobiotics, vitamins, lipids, amino acids, and carbohydrates. Determining their regional expression patterns along the intestinal tract will further characterize their transport functions in the gut. The mRNA expression levels of murine ABC transporters in the duodenum, jejunum, ileum, and colon were examined using the Affymetrix MuU74v2 GeneChip set. Eight ABC transporters (Abcb2, Abcb3, Abcb9, Abcc3, Abcc6, Abcd1, Abcg5, and Abcg8) displayed significant differential gene expression along the intestinal tract, as determined by two statistical models (a global error assessment model and a classic ANOVA, both with a P < 0.01). Concordance with semiquantitative real-time PCR was high. Analyzing the promoters of the differentially expressed ABC transporters did not identify common transcriptional motifs between family members or with other genes; however, the expression profile for Abcb9 was highly correlated with fibulin-1, and both genes share a common complex promoter model involving the NFkappaB, zinc binding protein factor (ZBPF), GC-box factors SP1/GC (SP1F), and early growth response factor (EGRF) transcription binding motifs. The cellular location of another of the differentially expressed ABC transporters, Abcc3, was examined by immunohistochemistry. Staining revealed that the protein is consistently expressed in the basolateral compartment of enterocytes along the anterior-posterior axis of the intestine. Furthermore, the intensity of the staining pattern is concordant with the expression profile. This agrees with previous findings in which the mRNA, protein, and transport function of Abcc3 were increased in the rat distal intestine. 
These data reveal regional differences in gene expression profiles along the intestinal tract and demonstrate that a complete understanding of intestinal ABC transporter function can only be achieved by examining the physiologically distinct regions of the gut.
Abstract:
MR structural T1-weighted imaging using high field systems (>3T) is severely hampered by the existing large transmit field inhomogeneities. New sequences have been developed to better cope with such nuisances. In this work we show the potential of a recently proposed sequence, the MP2RAGE, to obtain improved grey-white matter contrast with respect to conventional T1-w protocols, allowing for a better visualization of thalamic nuclei and different white matter bundles in the brain stem. Furthermore, the possibility to obtain high spatial resolution (0.65 mm isotropic) R1 maps fully independent of the transmit field inhomogeneities in a clinically acceptable time is demonstrated. In these high-resolution R1 maps it was possible to clearly observe varying properties of cortical grey matter throughout the cortex and observe different hippocampus fields with variations of intensity that correlate with known myelin concentration variations.
Abstract:
A T2 magnetization-preparation (T2 Prep) sequence is proposed that is insensitive to B1 field variations and simultaneously provides fat suppression without any further increase in specific absorption rate (SAR). Increased B1 inhomogeneity at higher magnetic field strength (B0 ≥ 3T) necessitates a preparation sequence that is less sensitive to B1 variations. For the proposed technique, T2 weighting in the image is achieved using a segmented B1-insensitive rotation (BIR-4) adiabatic pulse by inserting two equally long delays, one after the initial reverse adiabatic half passage (AHP), and the other before the final AHP segment of a BIR-4 pulse. This sequence yields T2 weighting with both B1 and B0 insensitivity. To simultaneously suppress fat signal (at the cost of B0 insensitivity), the second delay is prolonged so that fat accumulates additional phase due to its chemical shift. Numerical simulations as well as phantom and in vivo image acquisitions were performed to show the efficacy of the proposed technique.
Abstract:
To understand the biology and evolution of ruminants, the cattle genome was sequenced to about sevenfold coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1217 are absent or undetected in noneutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides a resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production.
Abstract:
IMPORTANCE: The association of copy number variations (CNVs), differing numbers of copies of genetic sequence at locations in the genome, with phenotypes such as intellectual disability has been almost exclusively evaluated using clinically ascertained cohorts. The contribution of these genetic variants to cognitive phenotypes in the general population remains unclear. OBJECTIVE: To investigate the clinical features conferred by CNVs associated with known syndromes in adult carriers without clinical preselection and to assess the genome-wide consequences of rare CNVs (frequency ≤0.05%; size ≥250 kilobase pairs [kb]) on carriers' educational attainment and intellectual disability prevalence in the general population. DESIGN, SETTING, AND PARTICIPANTS: The population biobank of Estonia contains 52,000 participants enrolled from 2002 through 2010. General practitioners examined participants and filled out a questionnaire of health- and lifestyle-related questions, as well as reported diagnoses. Copy number variant analysis was conducted on a random sample of 7877 individuals and genotype-phenotype associations with education and disease traits were evaluated. Our results were replicated on a high-functioning group of 993 Estonians and 3 geographically distinct populations in the United Kingdom, the United States, and Italy. MAIN OUTCOMES AND MEASURES: Phenotypes of genomic disorders in the general population, prevalence of autosomal CNVs, and association of these variants with educational attainment (from less than primary school through scientific degree) and prevalence of intellectual disability. RESULTS: Of the 7877 in the Estonian cohort, we identified 56 carriers of CNVs associated with known syndromes. Their phenotypes, including cognitive and psychiatric problems, epilepsy, neuropathies, obesity, and congenital malformations are similar to those described for carriers of identical rearrangements ascertained in clinical cohorts. 
A genome-wide evaluation of rare autosomal CNVs (frequency, ≤0.05%; ≥250 kb) identified 831 carriers (10.5%) of the screened general population. Eleven of 216 (5.1%) carriers of a deletion of at least 250 kb (odds ratio [OR], 3.16; 95% CI, 1.51-5.98; P = 1.5e-03) and 6 of 102 (5.9%) carriers of a duplication of at least 1 Mb (OR, 3.67; 95% CI, 1.29-8.54; P = .008) had an intellectual disability compared with 114 of 6819 (1.7%) in the Estonian cohort. The mean education attainment was 3.81 (P = 1.06e-04) among 248 (≥250 kb) deletion carriers and 3.69 (P = 5.024e-05) among 115 duplication carriers (≥1 Mb). Of the deletion carriers, 33.5% did not graduate from high school (OR, 1.48; 95% CI, 1.12-1.95; P = .005) and 39.1% of duplication carriers did not graduate high school (OR, 1.89; 95% CI, 1.27-2.8; P = 1.6e-03). Evidence for an association between rare CNVs and lower educational attainment was supported by analyses of cohorts of adults from Italy and the United States and adolescents from the United Kingdom. CONCLUSIONS AND RELEVANCE: Known pathogenic CNVs in unselected, but assumed to be healthy, adult populations may be associated with unrecognized clinical sequelae. Additionally, individually rare but collectively common intermediate-size CNVs may be negatively associated with educational attainment. Replication of these findings in additional population groups is warranted given the potential implications of this observation for genomics research, clinical care, and public health.
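The odds ratio quoted above for deletion carriers can be checked by hand from the reported counts. The following is a minimal sketch that assumes the plain sample odds ratio; the abstract does not say which estimator or confidence-interval method the authors used, and the function name `odds_ratio` is hypothetical.

```python
def odds_ratio(cases_exposed, total_exposed, cases_reference, total_reference):
    """Plain sample odds ratio from case counts and group sizes."""
    odds_exposed = cases_exposed / (total_exposed - cases_exposed)
    odds_reference = cases_reference / (total_reference - cases_reference)
    return odds_exposed / odds_reference


# Counts quoted in the abstract: 11 of 216 carriers of a deletion of at
# least 250 kb had an intellectual disability, versus 114 of 6819 in the
# comparison group.
deletion_or = odds_ratio(11, 216, 114, 6819)
print(round(deletion_or, 2))  # prints 3.16
```

With these counts the sample odds ratio is (11/205) / (114/6705) ≈ 3.16, which agrees with the reported value.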
Abstract:
BACKGROUND: Since its first detection, characterization of R. felis has been a matter of debate, mostly due to the contamination of an initial R. felis culture by R. typhi. However, the first stable culture of R. felis allowed its precise phenotypic and genotypic characterization, and demonstrated that this species belonged to the spotted fever group rickettsiae. Later, its genome sequence revealed the presence of two forms of the same plasmid, physically confirmed by biological data. In a recent article, Gillespie et al. (PLoS One. 2007;2(3):e266.) used a bioinformatic approach to refute the presence of the second plasmid form, and proposed the creation of a specific phylogenetic group for R. felis. METHODOLOGY/PRINCIPAL FINDINGS: In the present report, we and five independent international laboratories unambiguously confirmed by PCR the presence of two plasmid forms in R. felis strain URRWXCal(2) (T), but observed that the plasmid content of this species, from none to 2 plasmid forms, may depend on the culture passage history of the studied strain. We also demonstrated that R. felis does not grow in Vero cells at 37 °C but generates plaques at 30 °C. Finally, using a phylogenetic study based on 667 concatenated core genes, we demonstrated the position of R. felis within the spotted fever group. SIGNIFICANCE: We demonstrated that R. felis, which unambiguously belongs to the spotted fever group rickettsiae, may contain up to two plasmid forms but this plasmid content is unstable.
Abstract:
Apoptotic beta cell death is a major underlying cause of type I diabetes and, to a lesser extent, of type II diabetes. Recently, MST1 kinase was identified as a key apoptotic agent in the diabetic condition. In this study, I have examined MST1 and the closely related kinases MST2, MST3 and MST4, aiming to tackle diabetes by exploring ways to selectively block MST1 kinase activity. The first investigation was directed towards evaluating the possibility of selectively blocking the ATP binding site of MST1 kinase, which is essential for the activity of the enzyme. Structure and sequence analyses of this site, however, revealed near-absolute conservation between the MSTs and very few changes with respect to other kinases. The observed residue variations also displayed similar physicochemical properties, making selective inhibition of the enzyme difficult. Second, possibilities for allosteric inhibition of the enzyme were evaluated. Analysis of the recognized allosteric site posed the same problem, as the MSTs share almost all of the same residues. The third analysis was made on the SARAH domain, which is required for the dimerization and activation of MST1 and MST2 kinases. MST3 and MST4 lack this domain, hence selectivity against these two kinases can be achieved. Other proteins with SARAH domains, such as the RASSF proteins, were also examined. Their interactions with the MST1 SARAH domain were evaluated in order to mimic their binding pattern and design a peptide inhibitor that interferes with MST1 SARAH dimerization. In molecular simulations the RASSF5 SARAH domain was shown to interact strongly with the MST1 SARAH domain and possibly to prevent MST1 SARAH dimerization. Based on this, it was suggested that the peptidic inhibitor be based on the sequence of the RASSF5 SARAH domain. Since the MST2 kinase also interacts with the RASSF5 SARAH domain, absolute selectivity might not be achieved.
Abstract:
The quartzite microfabric found in the Lorrain Formation was studied across the La Cloche syncline, along a regional north-south transect along highway 6, near Whitefish Falls, Ontario. The complete stratigraphic sequence across the syncline is preserved, and is present on each fold limb. The lithostratigraphic units with the smallest grain size and lowest mica content are located close to the core of the fold, while coarser-grained, mica- and feldspar-rich units are situated at the northernmost and southernmost extents of the transect. Deformation mechanisms vary with lithology and with position across the fold. Pressure solution appears to be the dominant deformation mechanism in the feldspathic, micaceous and ferruginous units. In the finer-grained, mica-poor white medium-grained and cherty sandstone units, grain boundary migration (GBM) characteristics dominate over those of pressure solution, and these units show abundant fractures that cut migrated boundaries and therefore postdate GBM. All samples across the fold display a preferred orientation of quartz c-axes. The senses of asymmetry of fabrics are found to be similar across the syncline, with the exception of the ferruginous sandstone unit. The formation of these similar fabric symmetries cannot be the result of strain related to first order folding. The mica content appears to be related to the percentage of quartz lost due to pressure solution as a result of strain; the more mica present, the less quartz was lost. Calculations based on the shape of initial grains suggest that conservatively 30% of the quartz volume has been dissolved out of the Lorrain quartzite, and potentially migrated hundreds of meters to other members of the Huronian Supergroup as there was no meso or macroscopic evidence observed in outcrop.
Abstract:
Variations in different types of genomes have been found to be responsible for a large degree of physical diversity such as appearance and susceptibility to disease. Identification of genomic variations is difficult and can be facilitated through computational analysis of DNA sequences. Newly available technologies are able to sequence billions of DNA base pairs relatively quickly. These sequences can be used to identify variations within their specific genome but must be mapped to a reference sequence first. In order to align these sequences to a reference sequence, we require mapping algorithms that make use of approximate string matching and string indexing methods. To date, few mapping algorithms have been tailored to handle the massive amounts of output generated by newly available sequencing technologies. In order to handle this large amount of data, we modified the popular mapping software BWA to run in parallel using OpenMPI. Parallel BWA matches the efficiency of multithreaded BWA functions while providing efficient parallelism for BWA functions that do not currently support multithreading. Parallel BWA shows significant wall time speedup in comparison to multithreaded BWA on high-performance computing clusters, and will thus facilitate the analysis of genome sequencing data.
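The combination of string indexing and approximate string matching mentioned above can be illustrated with a toy seed-and-verify mapper. This is only a sketch under stated assumptions: it uses a plain k-mer hash index and Hamming distance (substitutions only), which is not how BWA works internally (BWA is built on a Burrows-Wheeler transform index), and all function names and sequences here are hypothetical.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """String indexing step: map every k-mer to its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference, index, k, max_mismatches):
    """Return sorted reference positions where the read aligns with at most
    max_mismatches substitutions (no insertions or deletions)."""
    hits = set()
    # Non-overlapping seeds: a substitution can spoil only one seed, so a
    # read with fewer mismatches than seeds has at least one exact seed.
    for offset in range(0, len(read) - k + 1, k):
        for pos in index.get(read[offset:offset + k], ()):
            start = pos - offset
            if start < 0 or start + len(read) > len(reference):
                continue
            # Verification step: approximate match of the full read.
            window = reference[start:start + len(read)]
            if hamming(read, window) <= max_mismatches:
                hits.add(start)
    return sorted(hits)


reference = "ACGTACGTTAGCCGATAGGCTTAACGGATCCA"  # made-up toy reference
index = build_kmer_index(reference, k=4)
exact = map_read("TAGCCGATAGGC", reference, index, 4, 1)
one_mismatch = map_read("TAGCCGTTAGGC", reference, index, 4, 1)
print(exact, one_mismatch)  # prints [8] [8]
```

Each read is mapped independently of the others, which is what makes it natural to distribute reads across processes, as the abstract describes for Parallel BWA.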
Abstract:
Genome sequence varies in numerous ways among individuals although the gross architecture is fixed for all humans. Retrotransposons create one of the most abundant structural variants in the human genome and are divided into many families, with certain members in some families, e.g., L1, Alu, SVA, and HERV-K, remaining active for transposition. Along with other types of genomic variants, retrotransposon-derived variants contribute to the whole spectrum of genome variants in humans. With the advancement of sequencing techniques, many human genomes are being sequenced at the individual level, fueling the comparative research on these variants among individuals. In this thesis, the evolution and functional impact of structural variations are examined, primarily focusing on retrotransposons in the context of human evolution. The thesis comprises three studies, presented in three data chapters. First, the recent evolution of all human-specific AluYb members, representing the second most active subfamily of Alus, was tracked to identify their source/master copy using a novel approach. All human-specific AluYb elements from the reference genome were extracted and aligned with one another to construct clusters of similar copies, and each cluster was analyzed to generate the evolutionary relationship between the members of the cluster. The approach resulted in identification of one major driver copy of all human-specific Yb8 and the source copy of the Yb9 lineage. Three new subfamilies within the AluYb family (Yb8a1, Yb10 and Yb11) were also identified, with Yb11 being the youngest and most polymorphic. Second, an attempt to construct a relation between transposable elements (TEs) and tandem repeats (TRs) was made at a genome-wide scale for the first time. Upon sequence comparison, positional cross-checking and other relevant analyses, it was observed that over 20% of all TRs are derived from TEs.
This result established the first connection between these two types of repetitive elements, and extends our appreciation for the impact of TEs on genomes. Furthermore, only 6% of these TE-derived TRs follow the already postulated initiation and expansion mechanisms, suggesting that the others are likely to follow a yet-unidentified mechanism. Third, by taking a combination of multiple computational approaches involving all types of genetic variations published so far including transposable elements, the first whole genome sequence of the most recent common ancestor of all modern human populations that diverged into different populations around 125,000-100,000 years ago was constructed. The study shows that the current reference genome sequence is 8.89 million base pairs larger than our common ancestor’s genome, contributed by a whole spectrum of genetic mechanisms. The use of this ancestral reference genome to facilitate the analysis of personal genomes was demonstrated using an example genome and more insightful recent evolutionary analyses involving the Neanderthal genome. The three data chapters presented in this thesis conclude that the tandem repeats and transposable elements are not two entirely distinctly isolated elements as over 20% TRs are actually derived from TEs. Certain subfamilies of TEs themselves are still evolving with the generation of newer subfamilies. The evolutionary analyses of all TEs along with other genomic variants helped to construct the genome sequence of the most recent common ancestor to all modern human populations which provides a better alternative to human reference genome and can be a useful resource for the study of personal genomics, population genetics, human and primate evolution.
Abstract:
Modern computer systems are plagued with stability and security problems: applications lose data, web servers are hacked, and systems crash under heavy load. Many of these problems or anomalies arise from rare program behavior caused by attacks or errors. A substantial percentage of web-based attacks are due to buffer overflows. Many methods have been devised to detect and prevent anomalous situations that arise from buffer overflows. The current state of the art in anomaly detection systems is relatively primitive and depends mainly on static code checking to take care of buffer overflow attacks. For protection, Stack Guards and Heap Guards are also used in many varieties. This dissertation proposes an anomaly detection system based on frequencies of system calls in the system call trace. System call traces represented as frequency sequences are profiled using sequence sets. A sequence set is identified by the starting sequence and frequencies of specific system calls. The deviation of the current input sequence from the corresponding normal profile in the frequency pattern of system calls is computed and expressed as an anomaly score. A simple Bayesian model is used for accurate detection. Experimental results are reported which show that the frequency of system calls, represented using sequence sets, captures the normal behavior of programs under normal conditions of usage. This captured behavior allows the system to detect anomalies with a low rate of false positives. Data are presented which show that a Bayesian network on frequency variations responds effectively to induced buffer overflows. It can also help administrators to detect deviations in program flow introduced due to errors.
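The idea of scoring a trace by how far its system call frequencies deviate from a normal profile can be sketched as follows. This is a simplified stand-in, not the dissertation's method: it omits the sequence sets and the Bayesian model, and uses a plain sum of absolute frequency differences instead. The function names and the example traces are hypothetical.

```python
from collections import Counter


def frequency_profile(traces):
    """Average relative frequency of each system call over normal traces."""
    totals = Counter()
    for trace in traces:
        for call, count in Counter(trace).items():
            totals[call] += count / len(trace)
    return {call: total / len(traces) for call, total in totals.items()}


def anomaly_score(trace, profile):
    """Sum of absolute differences between the trace's call frequencies and
    the profile. Ranges from 0 (identical) to 2 (no calls in common)."""
    counts = Counter(trace)
    calls = set(counts) | set(profile)
    return sum(abs(counts[c] / len(trace) - profile.get(c, 0.0)) for c in calls)


# Made-up traces for illustration only.
normal_traces = [
    ["open", "read", "read", "close"],
    ["open", "read", "write", "close"],
]
profile = frequency_profile(normal_traces)

typical = ["open", "read", "read", "close"]
suspicious = ["open", "read", "execve", "mprotect", "execve", "close"]
print(anomaly_score(typical, profile) < anomaly_score(suspicious, profile))  # prints True
```

A detector along these lines would flag a trace when its score exceeds a threshold chosen from normal data; how that threshold is set is left open here.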
Abstract:
In recent years, a large number of papers have reported the response of the cusp to solar wind variations under conditions of northward or southward Interplanetary Magnetic Field (IMF) Z-component (BZ). These studies have shown the importance of both temporal and spatial factors in determining the extent and morphology of the cusp and the changes in its location, connected to variations in the reconnection geometry. Here we present a comparative study of the cusp, focusing on an interval characterised by a series of rapid reversals in the BZ-dominated IMF, based on observations from space-borne and ground-based instrumentation. During this interval, from 08:00 to 12:00 UT on 12 February 2003, the IMF BZ component underwent four reversals, remaining for around 30 min in each orientation. The Cluster spacecraft were, at the time, on an outbound trajectory through the Northern Hemisphere magnetosphere, whilst the mainland VHF and Svalbard (ESR) radars of the EISCAT facility were operating in support of the Cluster mission. Both Cluster and EISCAT were, on occasion during the interval, observing the cusp region. The series of IMF reversals resulted in a sequence of poleward and equatorward motions of the cusp; consequently Cluster crossed the high altitude cusp twice before finally exiting the dayside magnetopause, both times under conditions of northward IMF BZ. The first magnetospheric cusp encounter, by all four Cluster spacecraft, showed reverse ion dispersion typical of lobe reconnection; subsequently, Cluster spacecraft 1 and 3 (only) crossed the cusp for a second time. We suggest that, during this second cusp crossing, these two spacecraft were likely to have been on newly closed field lines, which were first reconnected (opened) at low latitudes and later reconnected again (re-closed) poleward of the northern cusp.
Abstract:
The different triplet sequences in high molecular weight aromatic copolyimides comprising pyromellitimide units ("I") flanked by either ether-ketone ("K") or ether-sulfone residues ("S") show different binding strengths for pyrene-based tweezer-molecules. Such molecules bind primarily to the diimide unit through complementary π-π-stacking and hydrogen bonding. However, as shown by the magnitudes of 1H NMR complexation shifts and tweezer-polymer binding constants, the triplet "SIS" binds tweezer-molecules more strongly than "KIS", which in turn binds such molecules more strongly than "KIK". Computational models for tweezer-polymer binding, together with single-crystal X-ray analyses of tweezer-complexes with macrocyclic ether-imides, reveal that the variations in binding strength between the different triplet sequences arise from the different conformational preferences of aromatic rings at diarylketone and diarylsulfone linkages. These preferences determine whether or not chain-folding and secondary π-π-stacking occur between the arms of the tweezer-molecule and the 4,4'-biphenylene units which flank the central diimide residue.
Abstract:
The effects of varying the alkali metal cation in the high-temperature nucleophilic synthesis of a semi-crystalline, aromatic poly(ether ketone) have been systematically investigated, and striking variations in the sequence-distributions and thermal characteristics of the resulting polymers were found. Polycondensation of 4,4'-dihydroxybenzophenone with 1,3-bis(4-fluorobenzoyl)benzene in diphenylsulfone as solvent, in the presence of an alkali metal carbonate M2CO3 (M= Li, Na, K, or Rb) as base, affords a range of different polymers that vary in the distribution pattern of 2-ring and 3-ring monomer units along the chain. Lithium carbonate gives an essentially alternating and highly crystalline polymer, but the degree of sequence-randomisation increases progressively as the alkali metal series is descended, with rubidium carbonate giving a fully random and non-thermally-crystallisable polymer. Randomisation during polycondensation is shown to result from reversible cleavage of the ether linkages in the polymer by fluoride ions, and an isolated sample of alternating-sequence polymer is thus converted to a fully randomised material on heating with rubidium fluoride.