204 results for Short homologous sequences
in Chinese Academy of Sciences Institutional Repositories Grid Portal
Abstract:
Chimeric RNAs have been reported in a variety of organisms and are conventionally thought to be produced by trans-splicing of two or more distinct transcripts. Here, we conducted a large-scale search for chimeric RNAs in the budding yeast, fruit fly, mous
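Abstract:
The origin of new exons is an important molecular mechanism for increasing the diversity of the transcriptome and proteome. Many important questions about the evolutionary and functional characteristics of new exons and their parental genes remain unresolved. In this study, we first identified, at the genome-wide level, new exons that arose in human and mouse, and then analyzed the evolution and function of these exons and their parental genes. We found that new exons tend to be located in the UTRs of genes, especially the 5' UTR, which suggests that the emergence of some new exons may be related to the regulation of gene expression. We also found that genes that gain new exons show higher tissue specificity of expression, and that their functions tend to involve cellular regulation and interaction with the external environment. By analyzing orthologous genes in outgroups, we show that genes with higher evolutionary rates acquire new exons more readily, which corrects the earlier view that acquiring new exons accelerates the evolutionary rate of a gene. We carried out a detailed evolutionary analysis and functional study of the new exons that arose in the mammalian CDYL gene family. Our results show that, before the divergence of mammals, the CDYL gene acquired a new promoter and three new exons in the region upstream of the original gene. Subsequently, during the divergence of the mammalian lineages, the CDYL gene independently evolved one new exon in each of mouse, dog, and human. Homology alignments indicate that these new exons arose through exonization of intronic sequences. Evolutionary rates calculated between closely related species indicate that these newly arisen exons evolve rapidly, and that this rapid evolution may be driven by positive selection. In human, multiple mutations, including the gain of a new exon, a change of promoter, and the emergence of alternative splicing, gave the human CDYL gene a new splice isoform that encodes a longer protein. Experiments in the human HeLa cell line showed that both the new protein and the original protein have significant transcriptional repression activity, but that the repression activity of the new protein is weaker and that the two proteins interfere with each other. This result suggests that new proteins generated through the acquisition of new exons can enrich the existing system of gene expression regulation and make the regulatory network of the organism more precise. Chimeric RNAs are generally thought to form when exons derived from different pre-mRNAs are joined by trans-splicing. This phenomenon has been widely reported in many animals and plants. Our study first identified a large number of chimeric RNAs in yeast, fruit fly, mouse, and human through a large-scale search of expressed sequence tags (ESTs). This result indicates that the formation of chimeric RNAs is a widespread biological process in eukaryotes and an important molecular mechanism for increasing the diversity of the transcriptome and proteome. Sequence analysis of the chimeric RNAs showed that fewer than 20% have the canonical GU-AG splice sites at the junction and can therefore be explained by the classical trans-splicing model. Interestingly, however, we found short homologous sequences between the donor genes of about half of the chimeric RNAs. This finding led us to propose a new molecular mechanism to explain the formation of these chimeric RNAs, which we call the "transcriptional slippage" model. In yeast, we verified experimentally that short homologous sequences are necessary for the formation of chimeric RNAs, which strongly supports our model.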
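As an illustration of what looking for a short homologous sequence between two donor genes can mean computationally, the following is a minimal sketch that returns the longest exactly shared segment of two sequences. The function name `longest_shared_segment` and the exact-match criterion are assumptions made for this example; the abstract does not state how homology was detected or what length threshold was used.

```python
def longest_shared_segment(a, b):
    """Return the longest contiguous segment present in both sequences.

    Hypothetical helper for illustration only: exact matching via
    dynamic programming, not the method used in the study.
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)  # match lengths for the previous row of the DP table
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


if __name__ == "__main__":
    donor_1 = "AAACGTGCTT"  # made-up sequences, not real donor genes
    donor_2 = "GGCGTGCAA"
    print(longest_shared_segment(donor_1, donor_2))
```

A real analysis would restrict the search to the regions flanking the chimeric junction and would probably tolerate mismatches; both choices are left out here to keep the sketch short.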
Abstract:
Fourier spectra of 120 short coding sequences (<1,200 bp) show that not all coding sequences are characterized by 3-base periodicity. Statistical analysis suggests that whether a coding sequence has 3-base periodicity may be related to the composition and distribution of bases, the usage and order of the amino acids in the encoded protein, and synonymous codon usage. Generally, the A+U content is higher than the G+C content in non-period-3 sequences, and the reverse holds in period-3 sequences. Across the three codon positions, the base distribution is more uniform in non-period-3 sequences than in period-3 sequences. The usage biases of amino acids and codons are weaker in non-period-3 sequences than in period-3 sequences. All of these phenomena should be taken into account when predicting genes and exons in DNA sequences by Fourier analysis.
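The period-3 signal discussed above can be sketched as follows. This is a minimal example assuming the common approach of summing the Fourier power of the four base-indicator sequences at frequency 1/3 and dividing by the mean power over all nonzero frequencies; the function name `period3_ratio` and this particular normalization are assumptions for the example, not details taken from the abstract.

```python
import cmath


def period3_ratio(seq):
    """Power at frequency 1/3 divided by the mean power over nonzero frequencies.

    Hypothetical helper: the sequence is trimmed to a multiple of 3 so that
    frequency 1/3 is an exact DFT frequency.
    """
    seq = seq.upper().replace("U", "T")
    seq = seq[: len(seq) - len(seq) % 3]
    n = len(seq)
    if n < 3:
        return 0.0

    power_at_third = 0.0
    mean_numerator = 0.0
    for base in "ACGT":
        positions = [i for i, b in enumerate(seq) if b == base]
        # DFT of the 0/1 indicator sequence of this base at frequency 1/3.
        coeff = sum(cmath.exp(-2j * cmath.pi * i / 3) for i in positions)
        power_at_third += abs(coeff) ** 2
        # Parseval: total power over all n frequencies is n * count;
        # subtracting the zero-frequency term count**2 leaves the nonzero part.
        count = len(positions)
        mean_numerator += n * count - count ** 2

    if mean_numerator == 0:
        return 0.0  # a homopolymer has no power at nonzero frequencies
    return power_at_third / (mean_numerator / (n - 1))


if __name__ == "__main__":
    print(period3_ratio("ATG" * 20))   # strongly period-3 toy sequence
    print(period3_ratio("ACGT" * 15))  # period-4 toy sequence, no period-3 signal
```

A ratio well above 1 indicates a peak at frequency 1/3; where to set the cutoff between period-3 and non-period-3 sequences is a modelling choice that this sketch does not make.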
Abstract:
Amino acid substitution matrices play an essential role in protein sequence alignment, a fundamental task in bioinformatics. The most widely used matrices, such as PAM matrices derived from homologous sequences and BLOSUM matrices derived from aligned segments of PROSITE, do not incorporate conformational information in their construction. There are a few structure-based matrices, which are derived from limited structure-alignment data. Using the databases PDB_SELECT and DSSP, we create a database of sequence-conformation blocks that explicitly represent the sequence-structure relationship. Members of a block are identical in conformation and highly similar in sequence. From this block database, we derive a conformation-specific amino acid substitution matrix, CBSM60. The matrix shows improved performance in conformational segment search and homolog detection.
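To make concrete how a substitution matrix can be derived from aligned blocks, here is a minimal sketch of the standard BLOSUM-style log-odds calculation. It is not the CBSM60 procedure, which the abstract does not spell out; the function name `log_odds_matrix`, the toy blocks, and the omission of sequence clustering and rounding are all assumptions for the example.

```python
import math
from collections import Counter
from itertools import combinations


def log_odds_matrix(blocks):
    """Return {(a, b): score} with a <= b, in half-bit units.

    blocks: list of blocks, each a list of equal-length aligned segments.
    Hypothetical helper: pairs that are never observed get no entry.
    """
    pair_counts = Counter()
    for block in blocks:
        for column in zip(*block):
            # Count every unordered residue pair within an aligned column.
            for a, b in combinations(column, 2):
                pair_counts[tuple(sorted((a, b)))] += 1

    total = sum(pair_counts.values())
    observed = {pair: c / total for pair, c in pair_counts.items()}

    # Background frequency of each residue implied by the pair frequencies.
    background = Counter()
    for (a, b), freq in observed.items():
        if a == b:
            background[a] += freq
        else:
            background[a] += freq / 2
            background[b] += freq / 2

    scores = {}
    for (a, b), freq in observed.items():
        expected = background[a] ** 2 if a == b else 2 * background[a] * background[b]
        scores[(a, b)] = 2 * math.log2(freq / expected)
    return scores


if __name__ == "__main__":
    toy_blocks = [["AL", "AL", "AL"]]  # made-up block with two conserved columns
    print(log_odds_matrix(toy_blocks))
```

In a conformation-specific setting, one would presumably build the blocks from segments that share the same conformational state, so the same arithmetic would yield conformation-dependent scores; that extension is a guess and is not shown.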
Abstract:
The origin of new structures and functions is an important process in evolution. In the past decades, we have obtained some preliminary knowledge of the origin and evolution of new genes. However, the origin and evolution of exons, the basic units of genes, remain unclear. Because young exons retain the footprints of their origination, they are good material for studying the origin and evolution of new exons. In this paper, we report two young exons in a zinc finger protein gene of rodents. Since they are unique sequences in the mouse and rat genomes and no homologous sequences were found in the orthologous genes of human and pig, the young exons might have originated after the divergence of primates and rodents through exonization of intronic sequences. Strong positive selection was detected in the new exons between mouse and rat, suggesting that these exons have undergone significant functional divergence after the separation of the two species. On the other hand, mouse population genetics data demonstrate that the new exons have been subject to functional constraint, indicating an important function of the new exons in mouse. Functional analyses suggest that these new exons encode a nuclear localization signal peptide, which may mediate new ways of nuclear protein transport. To our knowledge, this is the first example of the origin and evolution of young exons.
Abstract:
We sequenced partial mitochondrial 16S ribosomal DNA (16S rDNA) of 18 firefly species from southwestern China. Combined with homologous sequences previously reported, phylogenetic trees including Japanese, Korean and Chinese species were reconstructed by
Abstract:
C1q is the first subcomponent of the classical pathway of the complement system and a major link between innate and acquired immunity. A globular (gC1q) domain similar to that of C1q is also found in many non-complement C1q-domain-containing (C1qDC) proteins, which have a crystal structure similar to that of the multifunctional tumor necrosis factor (TNF) ligand family and have diverse functions. In this study, we identified a total of 52 independent gene sequences encoding C1qDC proteins through comprehensive searches of zebrafish genome, cDNA and EST databases. Comparison with the 31 orthologous genes in human and the differing numbers in other species suggests significant selective pressure during vertebrate evolution. The domain organization of C1qDC proteins mainly includes a leading signal peptide, a collagen-like region of variable length, and a C-terminal C1q domain. There are 11 highly conserved residues within the C1q domain, 2 of which are invariant within the zebrafish gene set. More extensive database searches also revealed homologous C1qDC proteins in other vertebrates, invertebrates and even bacteria, but no homologous sequences encoding C1qDC proteins were found in many species that share a more recent evolutionary history with zebrafish. Therefore, further studies on C1qDC genes in different species will help us understand the evolutionary mechanisms of innate and acquired immunity.
Abstract:
A systematic study was initiated to identify genes with stage-specific expression in fish embryogenesis using the suppression subtractive hybridization (SSH) technique. Here we present preliminary results of a screen for genes expressed stage-specifically between the tail bud stage (TBS) and the heartbeat beginning stage (HBS) in gynogenetic silver crucian carp (Carassius auratus gibelio). Two SSH plasmid libraries, specific for TBS embryos and HBS embryos, were constructed, and stage-specific genes were screened between the two stages. A total of 1963 TBS positive clones and 2466 HBS positive clones were subjected to PCR amplification, and 1373 TBS and 1809 HBS PCR-positive clones were selected for dot blots. Of these, 169 TBS and 272 HBS dot-blot-positive clones were sequenced. GenBank searches with these nucleotide sequences indicated that most of the TBS dot-blot-positive clones had no homologous sequences in the database, whereas known genes were detected mainly among the HBS dot-blot-positive clones. Of the 79 known genes, 20 encode enzymes or kinases involved in metabolism important for embryonic development. Moreover, the specific expression of some of the genes was further confirmed by virtual northern blots. This study is a first step in a larger effort to study the temporal and spatial control of gene expression in gynogenetic fish embryogenesis.
Abstract:
Random amplified polymorphic DNA (RAPD) molecular markers specific for one, two or three clones have been identified from five gynogenetic clones of silver crucian carp (Carassius auratus gibelio Bloch) using RAPD markers developed earlier. In this study, three RAPD markers (RA1-PA, RA2-EF and RA4-D) produced by Opj-1, and two RAPD DNA fragments (RA3-PAD and RA5-D) produced by Opj-7, were selected for molecular cloning and sequencing. Sequence data indicated that the shared marker RA1-PA, cloned separately from clones P and A, had identical 801-bp nucleotide sequences, and that the shared marker RA2-EF, cloned from clones E and F, also had identical 958-bp nucleotide sequences. The nucleotide sequences of the shared marker RA3-PAD fragments were also similar over 1181 bp among clones P, A and D. The specific fragment RA4-D was composed of 628 bp, and the fragment RA5-D from clone D contained 385 nucleotides. Based on the nucleotide sequences, we designed and synthesized five pairs of sequence characterized amplified region (SCAR) primers to identify the specific fragments in these gynogenetic clones of silver crucian carp. Only individuals from clones P and A amplified a specific band using the pair of SC1-PA primers synthesized according to the marker RA1-PA sequences, whereas no products were detected in individuals from clones D, E and F. The PCR products amplified using the SC2-EF and SC3-PAD primers were as expected. Furthermore, the pair of SC4-D primers amplified specific bands only in individuals from clone D, although weak bands could be produced in all individuals of the five clones when lower annealing temperatures were used. However, an additional pair of SC5-D primers designed from the RA5-D marker sequences amplified a DNA band in individuals from clones P, A and D, and the same band, though weak, was produced in clone E, whereas no products were detected in individuals from clone F.
Searches in GenBank revealed that the 385-bp DNA fragment from RA5-D was homologous to the 5' end of gonadotropin I beta subunit 2 gene and growth hormone gene. No homologous sequences were found for other markers in GenBank. The SCAR markers identified in this study will offer a powerful, easy, and rapid method for discrimination of different clones and for genetic analyses that examine their origins and unique reproductive modes in crucian carp. Furthermore, they will likely benefit future selective breeding programs as reliable and reproducible molecular markers. (C) 2001 Elsevier Science B.V. All rights reserved.
Abstract:
In this work, crystallization and melting behavior of metallocene ethylene/alpha-olefin copolymers were investigated by differential scanning calorimetry (DSC) and atomic force microscopy (AFM). The results indicated that the crystallization and melting temperatures for all the samples were directly related to the long ethylene sequences instead of the average sequence length (ASL), whereas the crystallization enthalpy and crystallinity were directly related to ASL, that is, both parameters decreased with a decreasing ASL. Multiple melting peaks were analyzed by thermal analysis. Three phenomena contributed to the multiple melting behaviors after isothermal crystallization, that is, the melting of crystals formed during quenching, the melting-recrystallization process, and the coexistence of different crystal morphologies. Two types of crystal morphologies could coexist in samples having a high comonomer content after isothermal crystallization. They were the chain-folded lamellae formed by long ethylene sequences and the bundlelike crystals formed by short ethylene sequences. The coexistence phenomenon was further proved by the AFM morphological observation.
Abstract:
A metric representation of DNA sequences is borrowed from symbolic dynamics. In view of this method, the pattern seen in the chaos game representation of DNA sequences is explained as the suppression of certain nucleotide strings in the DNA sequences. Frequencies of short nucleotide strings and suppression of the shortest ones in the DNA sequences can be determined by using the metric representation.
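The link between the chaos game representation (CGR) and suppressed nucleotide strings can be illustrated with a short sketch. It shows the classic CGR point construction and a direct enumeration of absent k-mers; it does not implement the metric representation from symbolic dynamics that the abstract describes. The corner assignment in `CORNERS` and the function names are assumptions for the example.

```python
from itertools import product

# One common corner convention for the unit square; others are in use.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the CGR trajectory: each point is the midpoint between the
    previous point and the corner of the current base."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def missing_kmers(seq, k):
    """List all k-mers over ACGT that never occur in seq."""
    seq = seq.upper()
    seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return sorted(kmer for kmer in all_kmers if kmer not in seen)


if __name__ == "__main__":
    toy = "ACGTACGT"  # made-up sequence
    print(cgr_points(toy)[:2])
    print(missing_kmers(toy, 2))
```

Each k-mer maps to one sub-square of side 2^-k in the CGR image, so a k-mer that is absent from the sequence leaves its sub-square, and every sub-square nested inside it at finer scales, empty. That self-similar set of holes is the visible pattern.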
Abstract:
Features of the homologous relationships of proteins can provide us with a general picture of the protein universe, assist protein design and analysis, and further our comprehension of the evolution of organisms. Here we carried out a study of the evolution of protein molecules by investigating homologous relationships among residue segments. The motive was to identify detailed topological features of homologous relationships for short residue segments in the whole protein universe. Based on data from a large number of non-redundant proteins, the universe of non-membrane polypeptides was analyzed by considering both residue mutations and structural conservation. By connecting homologous segments with edges, we obtained a homologous relationship network of the whole universe of short residue segments, which we named the graph of polypeptide relationships (GPR). Since the network is extremely complicated for topological transitions, only subgraphs composed of vital nodes of the GPR were analyzed to obtain an in-depth understanding. This analysis of vital subgraphs of the GPR revealed a donut-shaped fingerprint. Use of this topological feature revealed the switch sites (where exposure of previously hidden fibril-forming "hot spots" begins, providing a further opportunity for protein aggregation; 188-202) of the conformational conversion of the normal alpha-helix-rich prion protein PrPC to the beta-sheet-rich PrPSc, which is thought to be responsible for a group of fatal neurodegenerative diseases, the transmissible spongiform encephalopathies. Efforts to analyze other proteins related to various conformational diseases are also introduced. (C) 2009 Elsevier Ltd. All rights reserved.
Abstract:
As a basic tool of modern biology, sequence alignment can provide useful information on the fold, function, and active site of a protein. In many cases, higher alignment quality means better performance. The motivation of the present work is to improve existing scoring schemes and algorithms by taking better account of residue–residue correlations. Based on a coarse-grained approach, the hydrophobic force between each pair of residues is written out from the protein sequence. This results in an intramolecular hydrophobic force network that describes all residue–residue interactions of a protein molecule and characterizes the protein's biological properties in the hydrophobic aspect. Earlier work has suggested that such a network can characterize the top weighted feature regarding hydrophobicity. Moreover, for the homologous proteins of a family, the corresponding networks share some common and representative family characters that eventually govern the conservation of biological properties during protein evolution. In the present work, we score such family-representative characters of a protein by the deviation of its intramolecular hydrophobic force network from that of the background. This score can assist existing scoring schemes and algorithms and improve multiple sequence alignment, e.g. achieving a prominent increase (50%) in finding structurally alike residue segments at a low identity level. Because its theoretical basis is different, the present scheme can assist most existing algorithms and improve their efficiency remarkably.
Abstract:
Three short-chain neurotoxins named NT-I, NT-II, and NT-III were purified from the venom of Naja kaouthia, a snake distributed throughout the south of Yunnan province, China, by a series of chromatographic steps, including an FPLC Resource S column. Their molecular weights, determined by matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) MS, were 6952.19 Da, 6854.92 Da, and 6828.80 Da, respectively. NT-I consisted of 62 amino acid residues, and the other two consisted of 61 amino acid residues, including 8 cysteines. After hydrolysis by endoproteinase Glu-C, their primary sequences were determined. A test of their activities demonstrated that they effectively inhibited muscle contractions induced by electric stimulation. Furthermore, the extent of inhibition caused by NT-II and NT-III was less than that of NT-I. The IC50 values were 0.04 μg/ml, 0.20 μg/ml, and 0.23 μg/ml for NT-I, NT-II, and NT-III, respectively. Compared with NT-II and NT-III, the higher activity of NT-I may be a result of the amino acid residue substitution Ile36 to Arg36.
Abstract:
Fringillidae is a large and diverse family of Passeriformes. So far, however, Fringillidae relationships deduced from morphological features and by a number of molecular approaches have remained unproven. Recently, much attention has been paid to mitochondrial tRNA genes, whose sequence and secondary structural characteristics have been shown to be useful for acrodont lizards and for deep-branch phylogenetic studies. In order to identify useful phylogenetic markers and test Fringillidae relationships, we sequenced three major clusters of mitochondrial tRNA genes from 15 Fringillidae taxa. A coincident tree, with Coturnix as outgroup, was obtained by the maximum-likelihood method using a combined dataset of 11 mitochondrial tRNA gene sequences. The result was similar to that obtained by neighbor-joining but different from that obtained by maximum-parsimony. Phylogenetic trees constructed with the stem-region sequences of the 11 genes had many different topologies and lower confidence than those constructed with the total sequences. On the other hand, some secondary structural characteristics may provide phylogenetic information on relatively short internal branches below the genus level. In summary, our data indicate that mitochondrial tRNA genes can achieve high confidence on Fringillidae phylogeny at the subfamily level, and that stem-region sequences may be suitable only above the family level. Secondary structural characteristics may also be useful for resolving the phylogenetic relationships between different genera of Fringillidae.