990 results for PROTEIN FAMILIES
Abstract:
The Schizosaccharomyces pombe Mei2 gene encodes an RNA recognition motif (RRM) protein that stimulates meiosis upon binding a specific non-coding RNA and subsequent accumulation in a “mei2-dot” in the nucleus. We present here the first systematic characterization of the family of proteins with characteristic Mei2-like amino acid sequences. Mei2-like proteins are an ancient eukaryotic protein family with three identifiable RRMs. The C-terminal RRM (RRM3) is unique to Mei2-like proteins and is the most highly conserved of the three RRMs. RRM3 also contains conserved sequence elements at its C-terminus not found in other RRM domains. Single copy Mei2-like genes are present in some fungi, in alveolates such as Paramecium and in the early branching eukaryote Entamoeba histolytica, while plants contain small families of Mei2-like genes. While the C-terminal RRM is highly conserved between plants and fungi, indicating conservation of molecular mechanisms, plant Mei2-like genes have changed biological context to regulate various aspects of developmental pattern formation.
Abstract:
Intrinsically disordered proteins (IDPs) are a relatively recently defined class of proteins which, under native conditions, lack a unique tertiary structure whilst maintaining essential biological functions. Functional classification of IDPs has implicated such proteins in various physiological processes, including transcription and translation regulation, signal transduction and protein modification. Actinidia DRM1 (Ade DORMANCY ASSOCIATED GENE 1) represents a robust dormancy marker whose mRNA transcript expression exhibits a strong inverse correlation with the onset of growth following periods of physiological dormancy. Bioinformatic analyses suggest that DRM1 is plant specific and highly conserved at both the nucleotide and protein levels. It is predicted to be an intrinsically disordered protein with two distinct highly conserved domains. Several Actinidia DRM1 homologues, which align into two distinct Actinidia-specific families, Type I and Type II, have been identified. No candidates for the Arabidopsis DRM1-Homologue (AtDRM2), an additional family member, have been identified in Actinidia.
Abstract:
Both the integrin and insulin-like growth factor binding protein (IGFBP) families independently play important roles in modulating tumor cell growth and progression. We present evidence for a specific cell surface localization and a bimolecular interaction between the αvβ3 integrin and IGFBP-2. The interaction, which could be specifically perturbed using vitronectin and αvβ3 blocking antibodies, was shown to modulate IGF-mediated cellular migration responses. Moreover, this interaction was observed in vivo and correlated with reduced tumor size of the human breast cancer cells, MCF-7β3, which overexpressed the αvβ3 integrin. Collectively, these results indicate that αvβ3 and IGFBP-2 act cooperatively in a negative regulatory manner to reduce tumor growth and the migratory potential of breast cancer cells.
Abstract:
The practice of medicine has always aimed at individualized treatment of disease. The relationship between patient and physician has always been a personal one, and the physician's choice of treatment has been intended to be the best fit for the patient's needs. The necessary pooling/grouping of disease families and their assignment to a number of drugs or treatment methods has, consequently, led to an increase in the number of effective therapies. However, given the heterogeneity of most human diseases, and cancer specifically, it is currently impossible for the treating clinician to effectively predict a patient's response and outcome based on current technologies, much less the idiosyncratic resistances and adverse effects associated with the limited therapeutic options.
Abstract:
Determination of sequence similarity is a central issue in computational biology, a problem addressed primarily through BLAST, an alignment-based heuristic which has underpinned much of the analysis and annotation of the genomic era. Despite their success, alignment-based approaches scale poorly with increasing data set size, and are not robust under structural sequence rearrangements. Successive waves of innovation in sequencing technologies – so-called Next Generation Sequencing (NGS) approaches – have led to an explosion in data availability, challenging existing methods and motivating novel approaches to sequence representation and similarity scoring, including adaptation of existing methods from other domains such as information retrieval. In this work, we investigate locality-sensitive hashing of sequences through binary document signatures, applying the method to a bacterial protein classification task. Here, the goal is to predict the gene family to which a given query protein belongs. Experiments carried out on a pair of small but biologically realistic datasets (the full protein repertoires of families of Chlamydia and Staphylococcus aureus genomes respectively) show that a measure of similarity obtained by locality-sensitive hashing gives highly accurate results while offering a number of avenues which will lead to substantial performance improvements over BLAST.
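The idea of comparing proteins through binary signatures can be illustrated with a minimal sketch. The abstract does not describe the signature construction, so the code below assumes a simhash-style scheme over overlapping k-mers; the function names, the k-mer length and the 64-bit width are all illustrative assumptions rather than the authors' method.

```python
import hashlib


def kmers(seq, k=3):
    """Return all overlapping k-mers of a protein sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def signature(seq, bits=64, k=3):
    """Simhash-style binary signature (an assumed scheme, not the paper's).

    Each k-mer is hashed to `bits` bits; every bit position receives a
    +1 or -1 vote, and the sign of the total decides the signature bit.
    """
    counts = [0] * bits
    for kmer in kmers(seq, k):
        digest = hashlib.blake2b(kmer.encode(), digest_size=bits // 8).digest()
        h = int.from_bytes(digest, "big")
        for b in range(bits):
            counts[b] += 1 if (h >> b) & 1 else -1
    sig = 0
    for b in range(bits):
        if counts[b] > 0:
            sig |= 1 << b
    return sig


def similarity(sig_a, sig_b, bits=64):
    """Fraction of signature bits that agree (1 - normalized Hamming distance)."""
    return 1.0 - bin(sig_a ^ sig_b).count("1") / bits


def predict_family(query, labelled, bits=64, k=3):
    """Assign the family label of the most similar labelled sequence."""
    qsig = signature(query, bits, k)
    best = max(labelled, key=lambda item: similarity(qsig, signature(item[0], bits, k), bits))
    return best[1]
```

Comparing two signatures costs one XOR and a bit count regardless of sequence length, which is where a speed advantage over alignment would come from. A practical system would also precompute and index the signatures instead of rehashing every labelled sequence per query.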
Abstract:
Familial articular chondrocalcinosis (CC) was first reported in 1963. It is characterised by multiple calcifications of hyaline and fibrous cartilage in the joints and intervertebral discs. Mutations in ANKH have been identified in several pedigrees as a monogenic cause for this disorder. ANKH is a key protein in pyrophosphate metabolism and is involved in pyrophosphate transport across the cell membrane. The objective of this work was to screen ANKH and ENPP1, two key genes in pyrophosphate metabolism, in Slovakian kindreds with familial CC. DNA samples from 25 individuals (10 affected, 15 unaffected) from 8 families were obtained. The promoter, coding regions and intron-exon boundaries of ANKH and ENPP1 were sequenced. Twelve DNA sequence variants, six in each gene, were identified. All the variants had been previously identified. None segregated with the disease. Our results suggest that neither ANKH nor ENPP1 mutations are the cause of CC in these families, indicating that possibly other major genes are involved in the aetiopathogenesis of this condition in these families.
Abstract:
Primary microcephaly (MCPH) is an autosomal-recessive congenital disorder characterized by smaller-than-normal brain size and mental retardation. MCPH is genetically heterogeneous with six known loci: MCPH1-MCPH6. We report mapping of a novel locus, MCPH7, to chromosome 1p32.3-p33 between markers D1S2797 and D1S417, corresponding to a physical distance of 8.39 Mb. Heterogeneity analysis of 24 families previously excluded from linkage to the six known MCPH loci suggested linkage of five families (20.83%) to the MCPH7 locus. In addition, four families were excluded from linkage to the MCPH7 locus as well as all of the six previously known loci, whereas the remaining 15 families could not be conclusively excluded or included. The combined maximum two-point LOD score for the linked families was 5.96 at marker D1S386 at theta = 0.0. The combined multipoint LOD score was 6.97 between markers D1S2797 and D1S417. Previously, mutations in four genes, MCPH1, CDK5RAP2, ASPM, and CENPJ, that code for centrosomal proteins have been shown to cause this disorder. Three different homozygous mutations in STIL, which codes for a pericentriolar and centrosomal protein, were identified in patients from three of the five families linked to the MCPH7 locus; all are predicted to truncate the STIL protein. Further, another recently ascertained family was homozygous for the same mutation as one of the original families. There was no evidence for a common haplotype. These results suggest that the centrosome and its associated structures are important in the control of neurogenesis in the developing human brain.
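For readers unfamiliar with LOD scores, the following sketch shows how a two-point score is defined and why scores from independent families can be added into a combined value. It assumes the textbook simplification of phase-known, fully informative meioses; the study itself would have used pedigree likelihood software, and the function names and counts here are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score under a simplified phase-known model (assumption).

    LOD(theta) = log10( L(theta) / L(0.5) ), with
    L(theta) = theta**r * (1 - theta)**(n - r).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    r, n = recombinants, meioses
    if theta == 0.0:
        # Any observed recombinant is impossible under complete linkage.
        if r > 0:
            return float("-inf")
        log_linked = 0.0
    else:
        log_linked = r * math.log10(theta) + (n - r) * math.log10(1.0 - theta)
    log_unlinked = n * math.log10(0.5)
    return log_linked - log_unlinked


def combined_lod(families, theta):
    """Sum LOD scores over independent families given as (recombinants, meioses)."""
    return sum(lod_score(r, n, theta) for r, n in families)
```

Under this model each non-recombinant meiosis contributes log10(2), about 0.30, at theta = 0, so a score above the conventional threshold of 3 requires at least ten such meioses pooled across families.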
Abstract:
BACKGROUND: The ATM gene encoding a putative protein kinase is mutated in ataxia-telangiectasia (A-T), an autosomal recessive disorder with a predisposition for cancer. Studies of A-T families suggest that female heterozygotes have an increased risk of breast cancer compared with noncarriers. However, neither linkage analyses nor mutation studies have provided supporting evidence for a role of ATM in breast cancer predisposition. Nevertheless, two recurrent ATM mutations, T7271G and IVS10-6T-->G, reportedly increase the risk of breast cancer. We examined these two ATM mutations in a population-based, case-control series of breast cancer families and multiple-case breast cancer families. METHODS: Five hundred twenty-five or 262 case patients with breast cancer and 381 or 68 control subjects, respectively, were genotyped for the T7271G and IVS10-6T-->G ATM mutations, as were index patients from 76 non-BRCA1/2 multiple-case breast cancer families. Linkage and penetrance were analyzed. ATM protein expression and kinase activity were analyzed in lymphoblastoid cell lines from mutation carriers. All statistical tests were two-sided. RESULTS: In case and control subjects unselected for family history of breast cancer, one case patient had the T7271G mutation, and none had the IVS10-6T-->G mutation. In three multiple-case families, one of these two mutations segregated with breast cancer. The estimated average penetrance of the mutations was 60% (95% confidence interval [CI] = 32% to 90%) to age 70 years, equivalent to a 15.7-fold (95% CI = 6.4-fold to 38.0-fold) increased relative risk compared with that of the general population. Expression and activity analyses of ATM in heterozygous cell lines indicated that both mutations are dominant negative. CONCLUSION: At least two ATM mutations are associated with a sufficiently high risk of breast cancer to be found in multiple-case breast cancer families. 
Full mutation analysis of the ATM gene in such families could help clarify the role of ATM in breast cancer susceptibility.
Abstract:
The striated muscle sarcomere is a force generating and transducing unit as well as an important sensor of extracellular cues and a coordinator of cellular signals. The borders of individual sarcomeres are formed by the Z-disks. The Z-disk component myotilin interacts with Z-disk core structural proteins and with regulators of signaling cascades. Missense mutations in the gene encoding myotilin cause dominantly inherited muscle disorders, myotilinopathies, by an unknown mechanism. In this thesis the functions of myotilin were further characterized to clarify the molecular biological basis and the pathogenetic mechanisms of inherited muscle disorders, mainly caused by mutated myotilin. Myotilin has an important function in the assembly and maintenance of the Z-disks probably through its actin-organizing properties. Our results show that the Ig-domains of myotilin are needed for both binding and bundling actin and define the Ig domains as actin-binding modules. The disease-causing mutations appear not to change the interplay between actin and myotilin. Interactions between Z-disk proteins regulate muscle functions and disruption of these interactions results in muscle disorders. Mutations in Z-disk components myotilin, ZASP/Cypher and FATZ-2 (calsarcin-1/myozenin-2) are associated with myopathies. We showed that proteins from the myotilin and FATZ families interact via a novel and unique type of class III PDZ binding motif with the PDZ domains of ZASP and other Enigma family members and that the interactions can be modulated by phosphorylation. The morphological findings typical of myotilinopathies include Z-disk alterations and aggregation of dense filamentous material. The causes and mechanisms of protein aggregation in myotilinopathy patients are unknown, but impaired degradation might explain in part the abnormal protein accumulation. 
We showed that myotilin is degraded by the calcium-dependent, non-lysosomal cysteine protease calpain and by the proteasome pathway, and that wild type and mutant myotilin differ in their sensitivity to degradation. These studies identify the first functional difference between mutated and wild type myotilin. Furthermore, if degradation of myotilin is disturbed, it accumulates in cells in a manner resembling that seen in myotilinopathy patients. Based on the results, we propose a model where mutant myotilin escapes proteolytic breakdown and forms protein aggregates, leading to disruption of myofibrils and muscular dystrophy. In conclusion, the main results of this study demonstrate that myotilin is a Z-disk structural protein interacting with several Z-disk components. The turnover of myotilin is regulated by calpain and the ubiquitin proteasome system and mutations in myotilin seem to affect the degradation of myotilin, leading to protein accumulations in cells. These findings are important for understanding myotilin-linked muscle diseases and designing treatments for these disorders.
Abstract:
Some leucine-rich repeat (LRR)-containing membrane proteins are known regulators of neuronal growth and synapse formation. In this work I characterize two gene families encoding neuronal LRR membrane proteins, namely the LRRTM (leucine-rich repeat, transmembrane neuronal) and NGR (Nogo-66 receptor) families. I studied the mRNA tissue distribution of LRRTM and NGR family members by RT-PCR and by in situ hybridization. Subcellular localization of LRRTM1 protein was studied in neurons and in non-neuronal cells. I discovered that LRRTM and NGR family mRNAs are predominantly expressed in the nervous system, and that each gene possesses a specific expression pattern. I also established that LRRTM and NGR family mRNAs are expressed by neurons, and not by glial cells. Within neurons, LRRTM1 protein is not transported to the plasma membrane; rather it localizes to the endoplasmic reticulum. Nogo-A (RTN4), MAG, and OMgp are myelin-associated proteins that bind to NgR1 to limit axonal regeneration after central nervous system injury. To better understand the functions of NgR2 and NgR3, and to explore the possible redundancy in the signaling of myelin inhibitors of neurite growth, I mapped the interactions between the NgR family and the known and candidate NgR1 ligands. I identified high-affinity interactions between RTN2-66, RTN3-66 and NgR1. I also demonstrate that Rtn3 mRNA is expressed in the same glial cell population of mouse spinal cord white matter as Nogo-A mRNA, and thus it could have a role in myelin inhibition of axonal growth. To understand how NgR1 interacts with multiple structurally divergent ligands, I aimed first to map in more detail the nature of Nogo-A:NgR1 interactions, and then to systematically map the binding sites of multiple myelin ligands in NgR1 by using a library of NgR1 expression constructs encoding proteins with one or multiple surface residues mutated to alanine.
My analysis of the Nogo-A:NgR1 interactions revealed a novel interaction site between the proteins, suggesting a trivalent Nogo-A:NgR1 interaction. Our analysis also defined a central binding region on the concave side of NgR1's LRR domain that is required for the binding of all known ligands, and a surrounding region critical for binding MAG and OMgp. To better understand the biological role of LRRTMs, I generated Lrrtm1 and Lrrtm3 knockout mice. I show here that reporter genes expressed from the targeted loci can be used for mapping the neuronal connections of Lrrtm1 and Lrrtm3 expressing neurons in finer detail. With regard to LRRTM1's role in humans, we found a strong association between a 70 kb-spanning haplotype in the proposed promoter region of the LRRTM1 gene and two possibly related phenotypes: left-handedness and schizophrenia. Interestingly, the responsible haplotype was linked to phenotypic variability only when paternally inherited. In summary, I identified two families of neuronal receptor-like proteins, and mapped their expression and certain protein-protein interactions. The identification of a central binding region in NgR1 shared by multiple ligands may facilitate the design and development of small molecule therapeutics blocking binding of all NgR1 ligands. Additionally, the genetic association data suggest that allelic variation upstream of LRRTM1 may play a role in the development of left-right brain asymmetry in humans. Lrrtm1 and Lrrtm3 knockout mice developed as a part of this study will likely be useful for schizophrenia and Alzheimer's disease research.
Genome-wide analysis and experimentation of plant serine/threonine/tyrosine-specific protein kinases
Abstract:
Protein tyrosine phosphorylation plays an important role in cell growth, development and oncogenesis. No classical protein tyrosine kinase has hitherto been cloned from plants. Do protein tyrosine kinases exist in plants? To address this, we have performed a genomic survey of protein tyrosine kinase motifs in plants using the delineated tyrosine phosphorylation motifs from the animal system. The Arabidopsis thaliana genome encodes 57 different protein kinases that have tyrosine kinase motifs. Animal non-receptor tyrosine kinases, SRC, ABL, LYN, FES, SEK, KIN and RAS have a structural relationship with putative plant tyrosine kinases. In an extended analysis, animal receptor and non-receptor kinases, Raf and Ras kinases, mixed lineage kinases and plant serine/threonine/tyrosine (STY) protein kinases form a well-supported group sharing a common origin within the superfamily of STY kinases. We report that plants lack bona fide tyrosine kinases, which raises the intriguing possibility that tyrosine phosphorylation is carried out by dual-specificity STY protein kinases in plants. The distribution pattern of STY protein kinase families on Arabidopsis chromosomes indicates that this gene family is partly a consequence of duplication and reshuffling of the Arabidopsis genome and of the generation of tandem repeats. Genome-wide analysis is supported by the functional expression and characterization of At2g24360 and phosphoproteomics of Arabidopsis. Evidence for tyrosine phosphorylated proteins is provided by alkaline hydrolysis, anti-phosphotyrosine immunoblotting, phosphoamino acid analysis and peptide mass fingerprinting. These results report the first comprehensive survey of genome-wide and tyrosine phosphoproteome analysis of plant STY protein kinases.
Abstract:
Background: Protein phosphorylation is a generic way to regulate signal transduction pathways in all kingdoms of life. In many organisms, it is achieved by the large family of Ser/Thr/Tyr protein kinases which are traditionally classified into groups and subfamilies on the basis of the amino acid sequence of their catalytic domains. Many protein kinases are multidomain in nature but the diversity of the accessory domains and their organization are usually not taken into account while classifying kinases into groups or subfamilies. Methodology: Here, we present an approach which considers amino acid sequences of complete gene products, in order to suggest refinements in sets of pre-classified sequences. The strategy is based on alignment-free similarity scores and iterative Area Under the Curve (AUC) computation. Similarity scores are computed by detecting common patterns between two sequences and scoring them using a substitution matrix, with a consistent normalization scheme. This allows us to handle full-length sequences, and implicitly takes into account domain diversity and domain shuffling. We quantitatively validate our approach on a subset of 212 human protein kinases. We then employ it on the complete repertoire of human protein kinases and suggest a few qualitative refinements in the subfamily assignment stored in the KinG database, which is based on catalytic domains only. Based on our new measure, we delineate 37 cases of potential hybrid kinases: sequences for which classical classification based entirely on catalytic domains is inconsistent with the full-length similarity scores computed here, which implicitly consider multi-domain nature and regions outside the catalytic kinase domain. We also provide some examples of hybrid kinases of the protozoan parasite Entamoeba histolytica. Conclusions: The implicit consideration of multi-domain architectures is a valuable inclusion to complement other classification schemes.
The proposed algorithm may also be employed to classify other families of enzymes with multidomain architecture.
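The AUC step can be sketched as follows. This is only an illustration of how an AUC measures whether a subfamily's members score higher against one another than against outsiders; the abstract does not give the iterative procedure or the scoring function, so the function name and the input scores are assumptions.

```python
def auc(within_scores, outside_scores):
    """Area under the ROC curve from two lists of similarity scores.

    Computed as the probability that a randomly chosen within-subfamily
    score exceeds a randomly chosen outside score, with ties counted as
    one half (the Mann-Whitney formulation).
    """
    if not within_scores or not outside_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for w in within_scores:
        for o in outside_scores:
            if w > o:
                wins += 1.0
            elif w == o:
                wins += 0.5
    return wins / (len(within_scores) * len(outside_scores))
```

An AUC near 1.0 would indicate that the full-length similarity scores agree with the existing subfamily assignment, while a markedly lower value would single out a sequence for closer inspection, for example as a candidate hybrid kinase.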
Abstract:
Most of the predisposition to hereditary breast and ovarian cancer has been attributed to inherited defects in two tumor suppressor genes, BRCA1 and BRCA2. To explore the contribution of BRCA1 mutations to hereditary breast cancer among Indian women, we examined the coding sequence of the BRCA1 gene in 14 breast cancer patients with a positive family history of breast and/or ovarian cancer. Mutation analysis was carried out using conformation sensitive gel electrophoresis (CSGE) followed by sequencing. Three mutations (21%) in the BRCA1 gene were identified. Two of them are novel mutations, of which one is a missense mutation in exon 7 near the RING finger domain, while the other is a one base pair deletion in exon 11 which results in protein truncation. The third mutation, 185delAG, has been previously described in Ashkenazi Jewish families. To our knowledge this is the first report of a study of germline BRCA1 mutation analysis in familial breast cancer in India. Our data from 14 different families suggest a lower prevalence but definite involvement of germline mutations in the BRCA1 gene among Indian women with breast cancer and a family history of breast cancer.
Abstract:
Over the past two decades, many ingenious efforts have been made in protein remote homology detection. Because homologous proteins often diversify extensively in sequence, it is challenging to demonstrate such relatedness through entirely sequence-driven searches. Here, we describe a computational method for the generation of 'protein-like' sequences that serves to bridge gaps in protein sequence space. Sequence profile information, as embodied in a position-specific scoring matrix of multiply aligned sequences of bona fide family members, serves as the starting point in this algorithm. The observed amino acid propensity and the selection of a random number dictate the selection of a residue for each position in the sequence. In a systematic manner, and by applying a 'roulette-wheel' selection approach at each position, we generate parent family-like sequences and thus facilitate an enlargement of sequence space around the family. When generated for a large number of families, we demonstrate that they expand the utility of natural intermediately related sequences in linking distant proteins. In 91% of the assessed examples, inclusion of designed sequences improved fold coverage by 5-10% over searches made in their absence. Furthermore, with several examples from proteins adopting folds such as TIM, globin, lipocalin and others, we demonstrate that the success of including designed sequences in a database positively sensitized methods such as PSI-BLAST and Cascade PSI-BLAST and is a promising opportunity for enormously improved remote homology recognition using sequence information alone.
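The roulette-wheel step described above can be sketched in a few lines. The sketch assumes the profile is available as one dictionary per alignment position, mapping each residue to a non-negative propensity; that representation and the function names are illustrative assumptions, not the authors' implementation.

```python
import random


def roulette_pick(propensities, rng):
    """Pick one residue with probability proportional to its propensity."""
    total = sum(propensities.values())
    if total <= 0:
        raise ValueError("at least one residue needs a positive propensity")
    spin = rng.random() * total  # where the wheel stops
    cumulative = 0.0
    last_positive = None
    for residue, weight in propensities.items():
        if weight <= 0:
            continue
        cumulative += weight
        last_positive = residue
        if spin < cumulative:
            return residue
    # Only reached through floating-point rounding at the top of the wheel.
    return last_positive


def generate_sequence(profile, rng=None):
    """Sample one family-like sequence, one residue per profile position."""
    rng = rng or random.Random()
    return "".join(roulette_pick(column, rng) for column in profile)
```

Calling `generate_sequence` repeatedly on the same profile yields a population of sequences that preserve conserved positions while varying at positions where the family itself is variable. Passing a seeded `random.Random` makes a run reproducible.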
Abstract:
Genomic data of several organisms have revealed the presence of a vast repertoire of multi-domain proteins. The role played by individual domains in a multi-domain protein has a profound influence on the overall function of the protein. In the present analysis an attempt has been made to better understand the tethering preferences of domain families that occur in multi-domain proteins. The analysis has been carried out on an exhaustive dataset of 2 961 898 sequences of proteins from 930 organisms, where 741 274 proteins contain at least two domain families. For every domain family, the number of other domain families with which it co-occurs within a protein in this dataset has been enumerated and is referred to as the tethering number of the domain family. It was found that, in the general dataset, the AAA ATPase family and the family of Ser/Thr kinases have the highest tethering numbers of 450 and 444 respectively. Further analysis reveals significant correlation between the number of members in a family and its tethering number. Positive correlation was also observed between the extent of sequence and functional diversity within a family and the tethering numbers of domain families. Domain families that are present ubiquitously in diverse organisms tend to have large tethering numbers, while organism/kingdom-specific families have low tethering numbers. Thus, the analysis uncovers how domain families recombine and evolve to give rise to multi-domain proteins.
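The tethering number as defined above is simple to compute once each protein is reduced to its list of domain families. The following is a minimal sketch under that assumption; the input format, the function name and the example family labels are hypothetical.

```python
from collections import defaultdict


def tethering_numbers(proteins):
    """Count, for each domain family, the distinct other families it co-occurs with.

    `proteins` is an iterable of domain-family lists, one list per protein.
    Repeated domains within a protein are counted once.
    """
    partners = defaultdict(set)
    for domains in proteins:
        families = set(domains)
        for family in families:
            # Single-domain proteins add an empty set, so the family
            # still appears in the result with a tethering number of 0.
            partners[family] |= families - {family}
    return {family: len(others) for family, others in partners.items()}


example = [
    ["Kinase", "SH2"],
    ["Kinase", "SH3"],
    ["AAA"],
    ["SH2", "SH3", "Kinase"],
]
numbers = tethering_numbers(example)
```

In this toy input `Kinase`, `SH2` and `SH3` each have a tethering number of 2, and `AAA`, which only occurs alone, has 0.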