21 results for sequences analysis technology
in Chinese Academy of Sciences Institutional Repositories Grid Portal
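Abstract:
Molecular systematics is built on experiment and computation. The spread of rapid DNA sequencing has given molecular systematists large amounts of data, and sequence analysis technology is an important tool for exploring those data and discovering knowledge. In the genomic era, as complete genome sequences of many model organisms become available, molecular systematics faces unprecedented opportunities and challenges. On the one hand, the Tree of Life project helps to identify new model organisms and to launch the corresponding genome projects; on the other hand, the genome projects of model organisms help to clarify the evolutionary relationships among them and the patterns of genome evolution. More importantly, the sequence analysis techniques of molecular systematics have developed into powerful tools for exploring and integrating genomic data, and thus play an important role in the life sciences. In fact, the interpenetration of molecular systematics and genomics is giving rise to a new interdisciplinary field, phylogenomics.
To lay a solid foundation for information management and data analysis in molecular systematics research, we built a molecular phylogenetic analysis platform. The platform provides researchers with specialized database services, technical support for data analysis, and related web resources.
The platform includes three specialized databases. The first is a DNA voucher specimen database. Each record has seven fields: English family name, Chinese family name, Latin species name, collector, collection number, collection locality and collection date. Users can search by setting values for one or more fields. As of 1 June 2004, the database contained 3491 specimen records. The second is a primer database. PCR primers are one of the essential requirements of molecular systematics experiments. Each record has three fields: primer name, sequence and annealing temperature. Users can search by setting values for one or more fields. As of 1 June 2004, the database contained 170 primer records for amplifying DNA sequences from plant nuclear, chloroplast and mitochondrial genomes. The third is a biocomputing database, which lets researchers transfer and store sequence analysis data and result files.
To ensure the security and usability of the databases, we developed database interfaces and search tools, as well as authentication procedures for system administrators and users. Through the former, users can upload, download, manage and search data. The latter sets the identities and permissions of different users. Administrators have higher privileges than users; they are mainly responsible for routine maintenance and management of the system and for authenticating new administrators and users.
The analysis support section aims to help users quickly master commonly used phylogenetic analysis methods, carry out effective data analysis, free themselves from complex statistical algorithms and computer programs, and concentrate on the biological interpretation of the results. In this section we first briefly introduce the commonly used analysis methods and then provide solutions for different problems in molecular systematics. These problems include phylogenetic reconstruction, estimation of substitution rates and divergence times, reconstruction of ancestral areas, testing of hypotheses of character evolution, and detection of adaptive evolution at the codon level. We particularly emphasize the key role that the likelihood ratio test and Bayesian inference play in molecular systematics as important methodological advances. This section also includes many commonly used molecular systematics programs and software packages, together with quick-start instructions and command blocks. After downloading and installing a program, users can follow the instructions and use the command blocks to analyze their data.
In addition, the platform lists the addresses of some commonly used web resources, such as bioinformatics centers, molecular evolution and phylogenetics laboratories, specialist journals and related databases.
Finally, four application examples are given, each a solution to a specific molecular systematics problem with preliminary analysis results.
The first example illustrates phylogenetic reconstruction. To determine the systematic position of Myricaceae, six DNA sequences and the indel characters in the chloroplast trnL-F region were analyzed. Separate analyses showed that the six sequences differ significantly in phylogenetic information. Combined analysis of the chloroplast genome sequences strongly supports a sister-group relationship between Myricaceae and (Casuarinaceae, (Betulaceae, Ticodendraceae)), and including the indel characters considerably improves resolution and support.
The second example shows how to infer historical biogeographic processes. We performed a maximum parsimony analysis of combined chloroplast trnL-F, matK, rbcL and atpB sequences from 25 genera in 8 families of Fagales and obtained a single most parsimonious tree. Based on this tree and the geographic distribution data of the 25 genera, we used dispersal-vicariance analysis to reconstruct the ancestral area at each node and inferred the distribution history of Fagales. The results indicate that the historical biogeography of Fagales consists of 3 vicariance events and 20 dispersal events. The most important vicariance event is the divergence between Nothofagaceae and its sister group, caused by the separation of Gondwana and Laurasia. In addition, parallel dispersals from Eurasia to North America, and even to South America, occurred repeatedly in Fagaceae and the core "higher" hamamelids.
The third example shows how to estimate divergence times. We again used the most parsimonious tree from the dispersal-vicariance analysis and optimized its branch lengths by maximum likelihood under the best-fit substitution model selected by hierarchical likelihood ratio tests. A likelihood ratio test showed that the tree does not conform to the molecular clock hypothesis. Using the geological separation of Gondwana and Laurasia and the earliest fossil records of 5 genera as calibration points, we estimated the interfamilial divergence times of Fagales by penalized likelihood without assuming a molecular clock. The results indicate that the great majority of interfamilial divergences occurred in the Cretaceous.
The fourth example shows how to detect adaptive evolution at the codon level. Likelihood ratio tests of models with variable selective pressure among branches indicate that the S gene of the SARS coronavirus underwent positive selection during cross-species transmission.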
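The likelihood ratio test of the molecular clock hypothesis mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the platform's own command blocks: the function names and the log-likelihood values are hypothetical, and it assumes the usual setup in which the clock model is nested in the no-clock model with n - 2 fewer free parameters for n taxa, so that twice the log-likelihood difference is compared with a chi-square distribution.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) of a chi-square variable with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2))
    #         + sqrt(2/pi) * exp(-x/2) * sum_{r=1}^{(df-1)/2} x^(r-1/2) / (1*3*...*(2r-1))
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int):
    """Return (statistic, p-value) for nested models; lnl_* are log-likelihoods."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(max(stat, 0.0), df)


def molecular_clock_lrt(lnl_clock: float, lnl_no_clock: float, n_taxa: int):
    """Clock (null) versus no-clock (alternative) test with df = n_taxa - 2."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    return likelihood_ratio_test(lnl_clock, lnl_no_clock, n_taxa - 2)


# Hypothetical log-likelihoods for a 25-taxon tree (not values from the thesis).
stat, p = molecular_clock_lrt(lnl_clock=-25310.4, lnl_no_clock=-25280.1, n_taxa=25)
print(f"2*deltaLnL = {stat:.2f}, df = 23, p = {p:.3g}")
```

A small p-value would lead to rejecting the clock, which is the situation the abstract describes before it turns to penalized likelihood.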
Abstract:
Acipenseriformes is an endangered primitive fish group that occupies a special place in the history of ideas concerning fish evolution, and even vertebrate evolution. However, the classification and evolution of these fishes remain debated. The mitochondrial DNA (mtDNA) ND4L and partial ND4 genes were sequenced for the first time in twelve species of the order Acipenseriformes, including endemic Chinese species. The following conclusions were drawn from DNA sequence analysis: (i) the two species of Huso can be ascribed to Acipenser; (ii) A. dabryanus is most closely related to A. sinensis and is most likely the landlocked form of A. sinensis; (iii) the genus Acipenser in the trans-Pacific region might have a common origin; (iv) the mtDNA ND4L and ND4 genes are ideal genetic markers for phylogenetic analysis of the order Acipenseriformes.
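Abstract:
Metal-particle contamination inside the vacuum chamber is an important factor that degrades the performance of laser thin films. A high-vacuum residual gas analyzer was used to analyze the atmosphere during film deposition. The heating-lamp holder, which was made of brass, was found to release Zn during operation; films deposited under these conditions incorporate metal impurities, which lowers their laser damage threshold. Surface analysis of the film composition confirmed the presence of zinc impurities. Laser damage experiments showed that films containing zinc impurities have a markedly lower damage threshold.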
Abstract:
Endosperm storage proteins in the seeds of higher plants are the main nitrogen source during germination and a major source of plant protein for humans and animals. Barley prolamins (hordeins) are the major storage proteins of the endosperm and account for 50–60% of total endosperm protein. Hordeins are classically divided by size and composition into three groups: sulphur-rich (B- and γ-hordeins), sulphur-poor (C-hordeins) and high-molecular-weight (HMW, D-hordeins) hordeins. B- and C-hordeins are the two major groups, accounting for about 70–80% and 10–12% of the total hordein fraction, respectively. Genetic analysis has shown that B-, C-, D- and γ-hordeins are encoded by the Hor2, Hor1, Hor3 and Hor5 loci, respectively, on chromosome 1H (5). The Hor2 locus is rich in alleles that encode numerous heterogeneous B-hordein polypeptides. The species, quantity and distribution of B-hordeins are reported to be significant factors affecting the malting, food and feed quality of barley. To understand the structure and organization of the B-hordein gene family in hull-less barley, to explore the developmental control of gene expression at the Hor2 locus, and ultimately to improve cereal grain quality, we cloned B-hordein genes and promoters from hull-less barley of the Qinghai-Tibet Plateau by homology-based PCR cloning, verified gene function in a bacterial expression system, and examined the spatial and temporal expression pattern of B-hordein genes by real-time quantitative PCR. The results are as follows.
1. Twenty-three B-hordein gene clones were obtained by PCR from nine hull-less barley accessions of the Qinghai-Tibet Plateau with distinctive B-hordein subunit compositions, using a primer pair designed from three previously published B-hordein gene sequences (GenBank No. X03103, X53690 and X53691), and were characterized by sequencing. Nucleotide sequence analysis showed that all clones contained the full-length coding region of the barley B-hordein genes. Eleven clones contained an in-frame stop codon and are probably pseudogenes. Analysis of the deduced amino acid sequences showed that all B-hordeins share a similar basic structure, comprising a highly conserved signal peptide, a central repetitive domain and a C-terminal domain. The number of repeat motifs in the repetitive domain varied considerably, producing polypeptides that differ in size and structure. The B-hordeins of accessions Z07–2 and Z26 had only twelve repeat motifs, which places them closer to those of wild barley. This variation was most probably caused by unequal crossing-over and/or slippage during replication, as has been suggested for the evolution of other prolamins. The relatedness of barley and wheat prolamin genes was assessed in a phylogenetic tree based on comparison of their polypeptides. The predicted B-hordeins of cultivated barley formed one subfamily, while the B-hordeins of wild barley and the two most similar LMW-GS sequences of T. aestivum formed another, indicating a distinct difference between the members of the two subfamilies. In addition, B-hordeins with identical C-terminal end sequences clustered into the same subgroup (except BXQ053, BZ09-1 and BZ26-5, each of which formed a group of its own). B-hordein gene subfamilies can therefore probably be classified on the basis of the conserved C-terminal end sequence of the predicted polypeptide, which avoids the limitation of classifying by size alone on SDS-PAGE banding patterns.
2. Three gene-specific reverse primers were designed from the 5' end sequences of the B-hordein genes cloned above. Using genomic DNA of Z09 and Z26 as templates, eight clones of upstream regulatory sequences of B-hordein genes (designated Z09P and Z26P) were obtained by SON-PCR and TAIL-PCR. Sequence analysis showed a putative TATA box at position –80 bp and a CAAT-like box at position –140 bp. In addition, six clones contained a putative Endosperm Box (EB), consisting of a highly conserved Endosperm Motif (EM) and a GCN4-like motif, at about –300 bp, and an endosperm-box-like structure at about –560 bp. Neither the Endosperm Box nor the endosperm-box-like structure was found in Z09P-2 and Z26P-3. This suggests that expression of the genes driven by these two promoters is probably controlled by different trans-acting factors, and that the promoters' control of gene expression is diverse.
3. The region of the B-hordein gene coding for the mature peptide was directionally cloned into expression vector pET-30a and transformed into E. coli expression strain BL21 for induced expression, to verify the function of the cloned genes. SDS–PAGE analysis showed that only cells carrying the pET-BZ07-2 and pET-BZ26-5 constructs produced proteins related to the B-group hordeins of barley. Expression was highest after 3 h of induction, and induction with 3 mM IPTG gave more protein than induction with 1 mM IPTG. This lays a foundation for isolating and purifying B-hordein and for further study of its effects on barley grain quality.
4. Gene-specific primers designed from the B-hordein gene sequences of Z09 and Z26 were used to quantify B-hordein gene expression, with the α-tubulin gene of Hordeum vulgare subsp. vulgare (GenBank accession number U40042) as the housekeeping control gene. Real-time fluorescent quantitative PCR was used to measure expression at four stages of endosperm development. In both accessions, expression increased gradually as development proceeded. Transcription of the B-hordein genes in Z09 was first detected 7 days after flowering, whereas low-level transcription was detected in Z26 4 days after flowering. This suggests either that the B-hordein genes are expressed earlier in Z26 than in Z09, or that expression in Z09 was too low to be detected. In addition, expression in Z26 was higher than in Z09 at all four developmental stages. Between 12 and 18 days after flowering, expression rose sharply in both Z09 and Z26. These results indicate that the B-hordein allelic variants encoded by the Hor2 locus are differentially expressed at the mRNA level during barley endosperm development.
Abstract:
Conventional 3D seismic exploration can no longer meet the demands of high-yield, high-efficiency and safe coal mine production. There is an urgent need in China to improve the detection of geological structures in coal mines. Based on 3D3C seismic exploration data, this work fully exploits multi-component seismic information and presents the first systematic study of 3D3C seismic data interpretation for coal measure strata. First, the coal measure strata are analyzed and a seismic-geologic model is built. A shear-wave log is constructed by regression analysis. Horizon calibration methods for PP- and PS-waves are studied, and the multi-wave data are used jointly to interpret small faults. Using main amplitude analysis technology, small faults that cannot be found on PP-wave sections can be interpreted on the low-frequency PS-wave sections. The aim of using PS-wave data to assist fine structural interpretation is thus achieved. Second, post-stack well-constrained inversion methods for PP- and PS-waves in coal measure strata are studied, and a joint PP- and PS-wave post-stack inversion workflow is established. Combining the inversion results yields additional attribute parameters, which are applied to fine lithological interpretation of coal measure strata. By exploring the relation between rock with negative Poisson's ratio and anisotropy, fracture development in coal seams is predicted. The petrophysical features of coal measure strata are studied, and relations between elastic parameters and lithology, fluid and physical properties are established. Physical parameters that reflect lithology and fluid properties, such as porosity, permeability and water saturation, are obtained by inversion. Finally, shear-wave splitting and Thomsen parameter inversion, which offer new approaches to seismic anisotropy interpretation of coal measure strata, are studied to predict fracture development.
The results of practical application indicate that the methods in this paper are feasible and applicable, and that they benefit high-yield, high-efficiency and safe coal mine production.
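One attribute that can be derived by combining P- and S-wave velocities is Poisson's ratio, which the abstract refers to when discussing rock with negative Poisson's ratio. The sketch below uses the standard relation for an isotropic elastic medium, sigma = (Vp^2 - 2 Vs^2) / (2 (Vp^2 - Vs^2)); it is an illustration only, not the inversion workflow of the thesis, and the function name is hypothetical.

```python
def poisson_ratio(vp: float, vs: float) -> float:
    """Poisson's ratio of an isotropic elastic medium from P- and S-wave velocities.

    vp and vs must be in the same units (for example m/s).
    """
    if vs <= 0 or vp <= vs:
        raise ValueError("require vp > vs > 0")
    vp2, vs2 = vp * vp, vs * vs
    return (vp2 - 2.0 * vs2) / (2.0 * (vp2 - vs2))


def is_negative_poisson(vp: float, vs: float) -> bool:
    """True when Vp/Vs < sqrt(2), which is equivalent to a negative Poisson's ratio."""
    return poisson_ratio(vp, vs) < 0.0
```

Under this relation a Vp/Vs ratio of 2 gives a Poisson's ratio of 1/3, and the ratio becomes negative once Vp/Vs falls below the square root of 2.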
Abstract:
Based on studies of sequence stratigraphy, modern sedimentology, basin analysis and the petroleum system of the steep slope of the depression, this paper establishes the sedimentary systems and models, the distribution of sand bodies and the pool-forming mechanism of subtle traps. The main conclusions and views are as follows. From extensive well-log and seismic analysis, the author established the sequence stratigraphic framework of the steep slope, systematically described its structural framework, and pointed out a distinctive feature of the study area: from south to north it is divided into several fault blocks, and from east to west into troughs and ridges. On this basis the study area was divided into 3 fault-block zones, in which the troughs became bedrock channels along which ancient rivers flowed into the lake, forming alluvial fans or fan deltas. The paper describes the depositional systems and depositional models of the steep slope zone and distinguishes 16 lithofacies and 3 depositional systems: the alluvial fan and fan-delta system, the lake system, and the turbidite fan or turbidity current deposits. The genetic pattern and distribution of the steep-slope sandy-conglomeratic fan bodies are set out completely for the first time. These fan bodies are distributed around the uplifts in a ring-like pattern. In the study area they are mainly distributed on the southern slopes of the Binxian uplift and the Chenjiazhuang uplift. In fault block I of the Wangzhuang area there is a widely distributed group of sandy-conglomeratic fan bodies, including alluvial fans, subaqueous channels, and turbidite fans or other turbidity current deposits. In fault block II there are fan-delta front and subaqueous channel deposits.
In the Binnan area, fault block I mainly contains alluvial fans (along the bedrock channels) and sandy-conglomeratic fan bodies formed as narrow belts of subaqueous channels (at the nose-ridge positions), fault block II contains fan-delta front sandy-conglomeratic fan bodies, and fault block III contains fan-delta front and turbidity current sandy-conglomeratic fan bodies. Based on the reservoirs' prominent characteristics of complex clastic composition and low textural maturity, the author divided the reservoir microstructure of the Sha-III and Sha-IV members into 4 types (the viscous crude cementation type, the pad cementation type, the calcite pore-funds type and the complex filling type) and on that basis comprehensively evaluated the 4 types of sandy-conglomeratic fan body reservoir. In the northwestern steep slope zone of the Dongying Depression, the crude oil is sourced from the Sha-III and Sha-IV members. The deep oil of the Lijin oilfield comes from the Sha-III and Sha-IV members respectively and is of the self-sourced, in-situ accumulation type, while a larger share of the crude oil produced by the Sha-IV member migrated to the Wangzhuang and Zhengjia areas. The crude oil of the Binnan and Shanjiasi oilfields is of mixed origin. The origin of the viscous crude that is widespread in the study area is explained systematically for the first time: loss of light components after pool formation, biodegradation, and water washing and oxidation. These accumulations are therefore secondary viscous crude accumulations. Also for the first time, the study area is divided into 5 pool-forming dynamic systems, characterized as normal-pressure or abnormal-pressure, self-sourced or externally sourced, and closed or semi-closed.
The 5 dynamic systems are interconnected through the pinch-out or merging of lithological and fluid-pressure compartment zones, through faults and through unconformity surfaces, and so form composite oil-gas accumulation zones. Three oil-gas pool-forming patterns were established: the self-sourced lateral-migration accumulation pattern, the self-sourced lateral step-like pool-forming pattern, and the externally sourced pressure-release-zone migration accumulation pattern. A systematic set of theories and methods for predicting oil and gas in sandy-conglomeratic fan bodies was established: the trough-fan correspondence is used to identify favorable target areas; seismic facies characteristics are used to identify the fan bodies and their scale qualitatively; time-frequency analysis is used to delineate their internal structure; coherence data volume analysis technology is used to describe their boundaries; and well-log-constrained inversion is used to trace and predict them quantitatively. Using these techniques, a total of 15 favorable sandy-conglomeratic fan bodies were predicted, exploration in the study area was effectively guided, and considerable economic and social benefits were obtained.
Abstract:
A recurrence plot technique for DNA sequences is established on the basis of metric representation and used to analyze the correlation structure of nucleotide strings. It is found that, in the transference of nucleotide strings, a human DNA fragment has a major correlation distance, whereas the correlation distance of a yeast chromosome shows a constant increase. (C) 2004 Elsevier B.V. All rights reserved.
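The idea of a recurrence plot for a nucleotide string can be sketched in a few lines. The abstract does not describe the metric representation it uses, so the sketch below swaps in a simpler criterion as an assumption: two positions recur when the k-letter words starting there differ in at most max_mismatch letters. The function names and parameters are hypothetical, and the quadratic loop is only suitable for short sequences.

```python
from collections import Counter


def recurrence_points(seq: str, k: int = 3, max_mismatch: int = 0):
    """Return pairs (i, j), i < j, whose k-letter words are within max_mismatch letters.

    Assumption: word-level Hamming distance stands in for the paper's metric
    representation, which is not given in the abstract.
    """
    seq = seq.upper()
    words = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    points = []
    for i, word_i in enumerate(words):
        for j in range(i + 1, len(words)):
            mismatches = sum(a != b for a, b in zip(word_i, words[j]))
            if mismatches <= max_mismatch:
                points.append((i, j))
    return points


def distance_histogram(points):
    """Count recurrences by offset j - i; peaks indicate dominant correlation distances."""
    return Counter(j - i for i, j in points)


# Toy sequence with period 4 (illustrative only).
hist = distance_histogram(recurrence_points("ACGTACGTACGT", k=4))
print(hist.most_common(3))
```

Plotting the points (i, j) gives the recurrence plot itself, and the histogram of offsets is one simple way to read a dominant correlation distance from it.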
Abstract:
In this study, a detailed analysis of both previously published and new data was performed to determine whether complete, or almost complete, mtDNA sequences can resolve the long-debated issue of which Asian mtDNAs were founder sequences for the Native American mtDNA pool. Unfortunately, we now know that coding-region data and their analysis are not without problems. Obtaining and reporting reasonably correct sequences does not seem to be a trivial task, and discriminating between Asian and Native American mtDNA ancestries may be more complex than previously believed. It is essential to take into account the effects of mutational hot spots in both the control and coding regions, so that the number of apparent Native American mtDNA founder sequences is not erroneously inflated. As we report here, a careful analysis of all available data indicates that there is very little evidence that more than five founder mtDNA sequences entered Beringia before the Last Glacial Maximum and left their traces in the current Native American mtDNA pool.
Abstract:
Background: Despite the small number of ursid species, bear phylogeny has long been a focus of study due to their conservation value, as all bear genera have been classified as endangered at either the species or subspecies level. The Ursidae family repre
Abstract:
We determined the complete mitochondrial DNA sequences of two cyprinid fishes, one surface-dwelling and one cave-dwelling: Sinocyclocheilus grahami and S. altishoulderus. Sequence comparison of the 13 protein-coding genes shows that the mutation pattern of each gene is quite similar to that of other vertebrate species. Analysis of Ka/Ks ratios at these loci between Sinocyclocheilus and two other cyprinid species (Cyprinus carpio and Procypris rabaudi) shows that the ratios differ among genes, consistent with purifying selection and variation in functional constraint among genes. Bayesian analysis and maximum likelihood analysis of the concatenated mitochondrial protein sequences for 14 cyprinid taxa support the monophyly of the family Cyprininae, and further confirm the monophyly of the genus Sinocyclocheilus. The two Sinocyclocheilus species fall within the Cyprinion-Onychostoma lineage, including Cyprinus, Carassius, and Procypris, rather than among the Barbinae, as previously suggested on morphological grounds.
Sequencing, annotation and comparative analysis of nine BACs of giant panda (Ailuropoda melanoleuca)
Abstract:
A 10-fold BAC library for the giant panda was constructed, and nine BACs were selected to generate finished sequences. These BACs can be used as a validation resource for the accuracy of the de novo assembly of giant panda whole-genome shotgun reads newly generated with Illumina GA sequencing technology. Complete Sanger sequencing, assembly, annotation and comparative analysis were carried out on the selected BACs, which have a combined length of 878 kb. Homology search and de novo prediction methods were used to annotate genes and repeats. Twelve protein-coding genes were predicted, seven of which could be functionally annotated. The seven genes have an average gene size of about 41 kb, an average coding size of about 1.2 kb and an average of 6 exons per gene. In addition, seven tRNA genes were found. About 27 percent of the BAC sequence is composed of repeats. A phylogenetic tree was constructed with the neighbor-joining algorithm across five species (giant panda, human, dog, cat and mouse), which reconfirms the dog as the species most closely related to the giant panda. Our results provide detailed sequence and structure information for new genes and repeats of the giant panda, which will be helpful for further studies of this species.
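The neighbor-joining algorithm named in this abstract builds a tree by repeatedly joining the pair of taxa that minimizes the Q criterion. The sketch below shows only the first joining step on a toy distance matrix. The taxon labels a to e and the distances are hypothetical and are not the five species or data of the study, and the function name is an illustration rather than the software the authors used.

```python
def nj_first_join(names, dist):
    """Pick the first pair joined by neighbor-joining and its two branch lengths.

    names: list of n taxon labels; dist: symmetric n x n matrix of distances.
    Returns (name_i, name_j, q_value, branch_i, branch_j).
    """
    n = len(names)
    if n < 3:
        raise ValueError("neighbor-joining needs at least 3 taxa")
    totals = [sum(row) for row in dist]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            # Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)
            q = (n - 2) * dist[i][j] - totals[i] - totals[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    q, i, j = best
    # Branch lengths from i and j to the new internal node.
    branch_i = 0.5 * dist[i][j] + (totals[i] - totals[j]) / (2 * (n - 2))
    branch_j = dist[i][j] - branch_i
    return names[i], names[j], q, branch_i, branch_j


# Toy 5-taxon distance matrix (illustrative values only).
toy_names = ["a", "b", "c", "d", "e"]
toy_dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(nj_first_join(toy_names, toy_dist))
```

A full implementation would replace the joined pair with the new node, recompute distances to it, and repeat until three nodes remain.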