112 resultados para Amino acid, dissolved
Resumo:
Genomic and proteomic analyses have attracted a great deal of interests in biological research in recent years. Many methods have been applied to discover useful information contained in the enormous databases of genomic sequences and amino acid sequences. The results of these investigations inspire further research in biological fields in return. These biological sequences, which may be considered as multiscale sequences, have some specific features which need further efforts to characterise using more refined methods. This project aims to study some of these biological challenges with multiscale analysis methods and stochastic modelling approach. The first part of the thesis aims to cluster some unknown proteins, and classify their families as well as their structural classes. A development in proteomic analysis is concerned with the determination of protein functions. The first step in this development is to classify proteins and predict their families. This motives us to study some unknown proteins from specific families, and to cluster them into families and structural classes. We select a large number of proteins from the same families or superfamilies, and link them to simulate some unknown large proteins from these families. We use multifractal analysis and the wavelet method to capture the characteristics of these linked proteins. The simulation results show that the method is valid for the classification of large proteins. The second part of the thesis aims to explore the relationship of proteins based on a layered comparison with their components. Many methods are based on homology of proteins because the resemblance at the protein sequence level normally indicates the similarity of functions and structures. However, some proteins may have similar functions with low sequential identity. We consider protein sequences at detail level to investigate the problem of comparison of proteins. The comparison is based on the empirical mode decomposition (EMD), and protein sequences are detected with the intrinsic mode functions. A measure of similarity is introduced with a new cross-correlation formula. The similarity results show that the EMD is useful for detection of functional relationships of proteins. The third part of the thesis aims to investigate the transcriptional regulatory network of yeast cell cycle via stochastic differential equations. As the investigation of genome-wide gene expressions has become a focus in genomic analysis, researchers have tried to understand the mechanisms of the yeast genome for many years. How cells control gene expressions still needs further investigation. We use a stochastic differential equation to model the expression profile of a target gene. We modify the model with a Gaussian membership function. For each target gene, a transcriptional rate is obtained, and the estimated transcriptional rate is also calculated with the information from five possible transcriptional regulators. Some regulators of these target genes are verified with the related references. With these results, we construct a transcriptional regulatory network for the genes from the yeast Saccharomyces cerevisiae. The construction of transcriptional regulatory network is useful for detecting more mechanisms of the yeast cell cycle.
Resumo:
The ghrelin axis consists of the gene products of the ghrelin gene (GHRL), and their receptors, including the classical ghrelin receptor GHSR. While it is well-known that the ghrelin gene encodes the 28 amino acid ghrelin peptide hormone, it is now also clear that the locus encodes a range of other bioactive molecules, including novel peptides and non-coding RNAs. For many of these molecules, the physiological functions and cognate receptor(s) remain to be determined. Emerging research techniques, including proteogenomics, are likely to reveal further ghrelin axis-derived molecules. Studies of the role of ghrelin axis genes, peptides and receptors, therefore, promises to be a fruitful area of basic and clinical research in years to come.
Resumo:
Obestatin is a 23 amino acid, ghrelin gene-derived peptide hormone produced in the stomach and a range of other tissues throughout the body. While it was initially reported that obestatin opposed the actions of ghrelin with regards to appetite and food intake, it is now clear that obestatin is not an endogenous ghrelin antagonist of ghrelin, but it is a multi-functional peptide hormone in its own right. In this review we will discuss the controversies associated with the discovery of obestatin and explore emerging central and peripheral roles of obestatin, roles in adipogenesis, pancreatic homeostasis and cancer.
Resumo:
Human hair fibres are ubiquitous in nature and are found frequently at crime scenes often as a result of exchange between the perpetrator, victim and/or the surroundings according to Locard's Principle. Therefore, hair fibre evidence can provide important information for crime investigation. For human hair evidence, the current forensic methods of analysis rely on comparisons of either hair morphology by microscopic examination or nuclear and mitochondrial DNA analyses. Unfortunately in some instances the utilisation of microscopy and DNA analyses are difficult and often not feasible. This dissertation is arguably the first comprehensive investigation aimed to compare, classify and identify the single human scalp hair fibres with the aid of FTIR-ATR spectroscopy in a forensic context. Spectra were collected from the hair of 66 subjects of Asian, Caucasian and African (i.e. African-type). The fibres ranged from untreated to variously mildly and heavily cosmetically treated hairs. The collected spectra reflected the physical and chemical nature of a hair from the near-surface particularly, the cuticle layer. In total, 550 spectra were acquired and processed to construct a relatively large database. To assist with the interpretation of the complex spectra from various types of human hair, Derivative Spectroscopy and Chemometric methods such as Principal Component Analysis (PCA), Fuzzy Clustering (FC) and Multi-Criteria Decision Making (MCDM) program; Preference Ranking Organisation Method for Enrichment Evaluation (PROMETHEE) and Geometrical Analysis for Interactive Aid (GAIA); were utilised. FTIR-ATR spectroscopy had two important advantages over to previous methods: (i) sample throughput and spectral collection were significantly improved (no physical flattening or microscope manipulations), and (ii) given the recent advances in FTIR-ATR instrument portability, there is real potential to transfer this work.s findings seamlessly to on-field applications. The "raw" spectra, spectral subtractions and second derivative spectra were compared to demonstrate the subtle differences in human hair. SEM images were used as corroborative evidence to demonstrate the surface topography of hair. It indicated that the condition of the cuticle surface could be of three types: untreated, mildly treated and treated hair. Extensive studies of potential spectral band regions responsible for matching and discrimination of various types of hair samples suggested the 1690-1500 cm-1 IR spectral region was to be preferred in comparison with the commonly used 1750-800 cm-1. The principal reason was the presence of the highly variable spectral profiles of cystine oxidation products (1200-1000 cm-1), which contributed significantly to spectral scatter and hence, poor hair sample matching. In the preferred 1690-1500 cm-1 region, conformational changes in the keratin protein attributed to the α-helical to β-sheet transitions in the Amide I and Amide II vibrations and played a significant role in matching and discrimination of the spectra and hence, the hair fibre samples. For gender comparison, the Amide II band is significant for differentiation. The results illustrated that the male hair spectra exhibit a more intense β-sheet vibration in the Amide II band at approximately 1511 cm-1 whilst the female hair spectra displayed more intense α-helical vibration at 1520-1515cm-1. In terms of chemical composition, female hair spectra exhibit greater intensity of the amino acid tryptophan (1554 cm-1), aspartic and glutamic acid (1577 cm-1). It was also observed that for the separation of samples based on racial differences, untreated Caucasian hair was discriminated from Asian hair as a result of having higher levels of the amino acid cystine and cysteic acid. However, when mildly or chemically treated, Asian and Caucasian hair fibres are similar, whereas African-type hair fibres are different. In terms of the investigation's novel contribution to the field of forensic science, it has allowed for the development of a novel, multifaceted, methodical protocol where previously none had existed. The protocol is a systematic method to rapidly investigate unknown or questioned single human hair FTIR-ATR spectra from different genders and racial origin, including fibres of different cosmetic treatments. Unknown or questioned spectra are first separated on the basis of chemical treatment i.e. untreated, mildly treated or chemically treated, genders, and racial origin i.e. Asian, Caucasian and African-type. The methodology has the potential to complement the current forensic analysis methods of fibre evidence (i.e. Microscopy and DNA), providing information on the morphological, genetic and structural levels.
Resumo:
Background Chlamydia pneumoniae is a widespread pathogen causing upper and lower respiratory tract infections in addition to a range of other diseases in humans and animals. Previous whole genome analyses have focused on four essentially clonal (> 99% identity) C. pneumoniae human genomes (AR39, CWL029, J138 and TW183), providing relatively little insight into strain diversity and evolution of this species. Results We performed individual gene-by-gene comparisons of the recently sequenced C. pneumoniae koala genome and four C. pneumoniae human genomes to identify species-specific genes, and more importantly, to gain an insight into the genetic diversity and evolution of the species. We selected genes dispersed throughout the chromosome, representing genes that were specific to C. pneumoniae, genes with a demonstrated role in chlamydial biology and/or pathogenicity (n = 49), genes encoding nucleotide salvage or amino acid biosynthesis proteins (n = 6), and extrachromosomal elements (9 plasmid and 2 bacteriophage genes). Conclusions We have identified strain-specific differences and targets for detection of C. pneumoniae isolates from both human and animal origin. Such characterisation is necessary for an improved understanding of disease transmission and intervention.
Resumo:
Recently, a polymorphism was identified in exon 25 of the factor V gene that is possibly a functional candidate for the HR2 haplotype. This haplotype is characterized by a single base substitution named R2 (A4070G) in the B domain of the protein. A mutation (A6755G; 2194Asp→Gly) located near the C terminus has been hypothesized to influence protein folding and glycosylation, and might be responsible for the shift in factor V isoform (FV1 / FV2) ratio. This study investigated the prevalence of these two factor V HR2 haplotype polymorphisms in a cohort of normal blood donors, patients with osteoarthritis and women with complications during pregnancy, and in families of factor V Leiden individuals. A high allele frequency for the two polymorphisms was found in the blood donor group (6.2% R2, 5.6% A6755G). No significant difference in allele frequency was observed in the clinical groups (obstetric complications and osteoarthritis, 4.1-4.9% for the two polymorphisms) when compared with that of healthy blood donors. We confirm that the factor V A6755G polymorphism shows strong linkage to the R2 allele, although it is not exclusively inherited with the exon 13 A4070G variant and can occur independently. © 2001 Lippincott Williams & Wilkins.
Resumo:
PCR-based cancer diagnosis requires detection of rare mutations in k- ras, p53 or other genes. The assumption has been that mutant and wild-type sequences amplify with near equal efficiency, so that they are eventually present in proportions representative of the starting material. Work on factor IX suggests that this assumption is invalid for one case of near- sequence identity. To test the generality of this phenomenon and its relevance to cancer diagnosis, primers distant from point mutations in p53 and k-ras were used to amplify wild-type and mutant sequences from these genes. A substantial bias against PCR amplification of mutants was observed for two regions of the p53 gene and one region of k-ras. For k-ras and p53, bias was observed when the wild-type and mutant sequences were amplified separately or when mixed in equal proportions before PCR. Bias was present with proofreading and non-proofreading polymerase. Mutant and wild-type segments of the factor V, cystic fibrosis transmembrane conductance regulator and prothrombin genes were amplified and did not exhibit PCR bias. Therefore, the assumption of equal PCR efficiency for point mutant and wild-type sequences is invalid in several systems. Quantitative or diagnostic PCR will require validation for each locus, and enrichment strategies may be needed to optimize detection of mutants.
Resumo:
Several studies have demonstrated an association between polycystic ovary syndrome (PCOS) and the dinucleotide repeat microsatellite marker D19S884, which is located in intron 55 of the fibrillin-3 (FBN3) gene. Fibrillins, including FBN1 and 2, interact with latent transforming growth factor (TGF)-β-binding proteins (LTBP) and thereby control the bioactivity of TGFβs. TGFβs stimulate fibroblast replication and collagen production. The PCOS ovarian phenotype includes increased stromal collagen and expansion of the ovarian cortex, features feasibly influenced by abnormal fibrillin expression. To examine a possible role of fibrillins in PCOS, particularly FBN3, we undertook tagging and functional single nucleotide polymorphism (SNP) analysis (32 SNPs including 10 that generate non-synonymous amino acid changes) using DNA from 173 PCOS patients and 194 controls. No SNP showed a significant association with PCOS and alleles of most SNPs showed almost identical population frequencies between PCOS and control subjects. No significant differences were observed for microsatellite D19S884. In human PCO stroma/cortex (n = 4) and non-PCO ovarian stroma (n = 9), follicles (n = 3) and corpora lutea (n = 3) and in human ovarian cancer cell lines (KGN, SKOV-3, OVCAR-3, OVCAR-5), FBN1 mRNA levels were approximately 100 times greater than FBN2 and 200–1000-fold greater than FBN3. Expression of LTBP-1 mRNA was 3-fold greater than LTBP-2. We conclude that FBN3 appears to have little involvement in PCOS but cannot rule out that other markers in the region of chromosome 19p13.2 are associated with PCOS or that FBN3 expression occurs in other organs and that this may be influencing the PCOS phenotype.
Resumo:
Background The majority of peptide bonds in proteins are found to occur in the trans conformation. However, for proline residues, a considerable fraction of Prolyl peptide bonds adopt the cis form. Proline cis/trans isomerization is known to play a critical role in protein folding, splicing, cell signaling and transmembrane active transport. Accurate prediction of proline cis/trans isomerization in proteins would have many important applications towards the understanding of protein structure and function. Results In this paper, we propose a new approach to predict the proline cis/trans isomerization in proteins using support vector machine (SVM). The preliminary results indicated that using Radial Basis Function (RBF) kernels could lead to better prediction performance than that of polynomial and linear kernel functions. We used single sequence information of different local window sizes, amino acid compositions of different local sequences, multiple sequence alignment obtained from PSI-BLAST and the secondary structure information predicted by PSIPRED. We explored these different sequence encoding schemes in order to investigate their effects on the prediction performance. The training and testing of this approach was performed on a newly enlarged dataset of 2424 non-homologous proteins determined by X-Ray diffraction method using 5-fold cross-validation. Selecting the window size 11 provided the best performance for determining the proline cis/trans isomerization based on the single amino acid sequence. It was found that using multiple sequence alignments in the form of PSI-BLAST profiles could significantly improve the prediction performance, the prediction accuracy increased from 62.8% with single sequence to 69.8% and Matthews Correlation Coefficient (MCC) improved from 0.26 with single local sequence to 0.40. Furthermore, if coupled with the predicted secondary structure information by PSIPRED, our method yielded a prediction accuracy of 71.5% and MCC of 0.43, 9% and 0.17 higher than the accuracy achieved based on the singe sequence information, respectively. Conclusion A new method has been developed to predict the proline cis/trans isomerization in proteins based on support vector machine, which used the single amino acid sequence with different local window sizes, the amino acid compositions of local sequence flanking centered proline residues, the position-specific scoring matrices (PSSMs) extracted by PSI-BLAST and the predicted secondary structures generated by PSIPRED. The successful application of SVM approach in this study reinforced that SVM is a powerful tool in predicting proline cis/trans isomerization in proteins and biological sequence analysis.
Resumo:
Background The residue-wise contact order (RWCO) describes the sequence separations between the residues of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered as a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable important information to reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. Results We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences. Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance with the CC to 0.57 and an RMSE of 0.79. In addition, combining the predicted secondary structure by PSIPRED was found to significantly improve the prediction performance and could yield the best prediction accuracy with a CC of 0.60 and RMSE of 0.78, which provided at least comparable performance compared with the other existing methods. Conclusion The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
Resumo:
This paper describes the cloning and characterization of a new member of the vascular endothelial growth factor (VEGF) gene family, which we have designated VRF for VEGF-related-factor. Sequencing of cDNAs from a human fetal brain library and RT-PCR products from normal and tumor tissue cDNA pools indicate two alternatively spliced messages with open reading frames of 621 and 564 bp, respectively. The predicted proteins differ at their carboxyl ends resulting from a shift in the open reading frame. Both isoforms show strong homology to VEGF at their amino termini, but only the shorter isoform maintains homology to VEGF at its carboxyl terminus and conserves all 16 cysteine residues of VEGF165. Similarity comparisons of this isoform revealed overall protein identity of 48% and conservative substitution of 69% with VEGF189. VRF is predicted to contain a signal peptide, suggesting that it may be a secreted factor. The VRF gene maps to the D11S750 locus at chromosome band 11q13, and the protein coding region, spanning approximately 5 kb, is comprised of 8 exons that range in size from 36 to 431 bp. Exons 6 and 7 are contiguous and the two isoforms of VRF arise through alternate splicing of exon 6. VRF appears to be ubiquitously expressed as two transcripts of 2.0 and 5.5 kb; the level of expression is similar among normal and malignant tissues.
Resumo:
The CDKN2 gene, encoding the cyclin-dependent kinase inhibitor p16, is a tumour suppressor gene that maps to chromosome band 9p21-p22. The most common mechanism of inactivation of this gene in human cancers is through homozygous deletion; however, in a smaller proportion of tumours and tumour cell lines intragenic mutations occur. In this study we have compiled a database of over 120 published point mutations in the CDKN2 gene from a wide variety of tumour types. A further 50 deletions, insertions, and splice mutations in CDKN2 have also been compiled. Furthermore, we have standardised the numbering of all mutations according to the full-length 156 amino acid form of p16. From this study we are able to define several hot spots, some of which occur at conserved residues within the ankyrin domains of p16. While many of the hotspots are shared by a number of cancers, the relative importance of each position varies, possibly reflecting the role of different carcinogens in the development of certain tumours. As reported previously, the mutational spectrum of CDKN2 in melanomas differs from that of internal malignancies and supports the involvement of UV in melanoma tumorigenesis. Notably, 52% of all substitutions in melanoma-derived samples occurred at just six nucleotide positions. Nonsense mutations comprise a comparatively high proportion of mutations present in the CDKN2 gene, and possible explanations for this are discussed.
Resumo:
To evaluate the timing of mutations in BRAF (v-raf murine sarcoma viral oncogene homolog B1) during melanocytic neoplasia, we carried out mutation analysis on microdissected melanoma and nevi samples. We observed mutations resulting in the V599E amino-acid substitution in 41 of 60 (68%) melanoma metastases, 4 of 5 (80%) primary melanomas and, unexpectedly, in 63 of 77 (82%) nevi. These data suggest that mutational activation of the RAS/RAF/MAPK pathway in nevi is a critical step in the initiation of melanocytic neoplasia but alone is insufficient for melanoma tumorigenesis.
Resumo:
We have used microarray gene expression profiling and machine learning to predict the presence of BRAF mutations in a panel of 61 melanoma cell lines. The BRAF gene was found to be mutated in 42 samples (69%) and intragenic mutations of the NRAS gene were detected in seven samples (11%). No cell line carried mutations of both genes. Using support vector machines, we have built a classifier that differentiates between melanoma cell lines based on BRAF mutation status. As few as 83 genes are able to discriminate between BRAF mutant and BRAF wild-type samples with clear separation observed using hierarchical clustering. Multidimensional scaling was used to visualize the relationship between a BRAF mutation signature and that of a generalized mitogen-activated protein kinase (MAPK) activation (either BRAF or NRAS mutation) in the context of the discriminating gene list. We observed that samples carrying NRAS mutations lie somewhere between those with or without BRAF mutations. These observations suggest that there are gene-specific mutation signals in addition to a common MAPK activation that result from the pleiotropic effects of either BRAF or NRAS on other signaling pathways, leading to measurably different transcriptional changes.