954 results for Character coding
Abstract:
Starting from a biologically active recombinant DNA clone of exogenous unintegrated GR mouse mammary tumor virus, we have generated three subclones of PstI fragments of 1.45, 1.1, and 2.0 kb in the plasmid vector pBR322. The nucleotide sequence has been determined for the clone of 1.45 kb which includes almost the complete region of the long terminal repeat (LTR) plus an adjacent stretch of unique sequence DNA. A short region of the 2.0 kb clone, containing the beginning of the LTR, has also been sequenced. Starting with the A of an initiation codon outside the LTR, we detected an open reading frame of 960 nucleotides, potentially coding for a protein of 320 amino acids (36K). Two hundred nucleotides downstream from the termination codon, and approximately 25 nucleotides upstream from the presumptive initiation site of viral RNA synthesis, we found a promoter-like sequence. The sequence AGTAAA was detected approximately 15-20 nucleotides upstream from the 3' end of virion RNA and probably serves as a polyadenylation signal. The 1.45 kb PstI fragment has been transfected into Ltk- cells together with a plasmid containing the thymidine kinase gene of herpes simplex virus. The virus-specific RNA synthesis detected in a Tk+ cell clone was strongly stimulated by the addition of dexamethasone.
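The figures quoted for the open reading frame can be checked with simple arithmetic. The sketch below is illustrative only: the average residue mass of 110 Da is a common rule of thumb assumed here, not a value from the abstract, and the function name `orf_summary` is hypothetical.

```python
# Back-of-the-envelope check of the open reading frame size quoted above.
ORF_NUCLEOTIDES = 960        # from the abstract
NUCLEOTIDES_PER_CODON = 3
AVG_RESIDUE_MASS_DA = 110    # assumption: rule-of-thumb average, not from the abstract


def orf_summary(nucleotides):
    """Return (amino-acid count, approximate protein mass in kDa) for an ORF length."""
    if nucleotides % NUCLEOTIDES_PER_CODON != 0:
        raise ValueError("ORF length must be a multiple of 3")
    residues = nucleotides // NUCLEOTIDES_PER_CODON
    mass_kda = residues * AVG_RESIDUE_MASS_DA / 1000
    return residues, mass_kda


residues, mass_kda = orf_summary(ORF_NUCLEOTIDES)
print(residues, mass_kda)
```

Under that assumption the estimate lands close to the 36K quoted in the abstract; the exact mass depends on the actual amino acid composition.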
Abstract:
The opportunistic pathogen Pseudomonas aeruginosa PAO1 has a remarkable capacity to adapt to various environments and to survive with limited nutrients. Here, we report the discovery and characterization of a novel small non-coding RNA: NrsZ (nitrogen-regulated sRNA). We show that under nitrogen limitation, NrsZ is induced by the NtrB/C two-component system, an important regulator of nitrogen assimilation and P. aeruginosa's swarming motility, in concert with the alternative sigma factor RpoN. Furthermore, we demonstrate that NrsZ modulates P. aeruginosa motility by controlling the production of rhamnolipid surfactants, virulence factors notably needed for swarming motility. This regulation takes place through the post-transcriptional control of rhlA, a gene essential for rhamnolipid synthesis. Interestingly, we also observed that NrsZ is processed in three similar short modules, and that the first short module encompassing the first 60 nucleotides is sufficient for NrsZ regulatory functions.
Abstract:
Cell-free translation of total RNA isolated from vaccinia virus-infected cells late in infection results in a complex mixture of polypeptides. A monospecific antibody directed against one of the major structural proteins of the virus particle immunoprecipitated a single polypeptide with a molecular weight of 11,000 (11K) from this mixture. Immunoprecipitation was therefore used to identify the structural polypeptide among the in vitro translation products of RNA purified by hybridization selection to restriction fragments of the vaccinia virus genome. This allowed us to map the mRNA coding for the 11K polypeptide to the extreme left-hand end of the HindIII E fragment. Detailed transcriptional mapping of this region of the genome by nuclease S1 analysis revealed the presence of a late RNA transcribed from the rightward-reading strand. Its 5' end mapped at ca. 130 base pairs to the left of the HindIII site at the junction between the HindIII F and E fragments. The map position of this RNA coincided precisely with the map position of the late message coding for the 11K polypeptide.
Abstract:
Cardiovascular diseases and in particular heart failure are major causes of morbidity and mortality in the Western world. Recently, the notion of promoting cardiac regeneration as a means to replace lost cardiomyocytes in the damaged heart has engendered considerable research interest. These studies envisage the utilization of both endogenous and exogenous cellular populations, which undergo highly specialized cell fate transitions to promote cardiomyocyte replenishment. Such transitions are under the control of regenerative gene regulatory networks, which are enacted by the integrated execution of specific transcriptional programs. In this context, it is emerging that the non-coding portion of the genome is dynamically transcribed generating thousands of regulatory small and long non-coding RNAs, which are central orchestrators of these networks. In this review, we discuss more particularly the biological roles of two classes of regulatory non-coding RNAs, i.e. microRNAs and long non-coding RNAs, with a particular emphasis on their known and putative roles in cardiac homeostasis and regeneration. Indeed, manipulating non-coding RNA-mediated regulatory networks could provide keys to unlock the dormant potential of the mammalian heart to regenerate. This should ultimately improve the effectiveness of current regenerative strategies and discover new avenues for repair. This article is part of a Special Issue entitled: Cardiomyocyte Biology: Cardiac Pathways of Differentiation, Metabolism and Contraction.
Abstract:
A number of experimental methods have been reported for estimating the number of genes in a genome, or the closely related coding density of a genome, defined as the fraction of base pairs in codons. Recently, DNA sequence data representative of the genome as a whole have become available for several organisms, making the problem of estimating coding density amenable to sequence analytic methods. Estimates of coding density for a single genome vary widely, so that methods with characterized error bounds have become increasingly desirable. We present a method to estimate the protein coding density in a corpus of DNA sequence data, in which a ‘coding statistic’ is calculated for a large number of windows of the sequence under study, and the distribution of the statistic is decomposed into two normal distributions, assumed to be the distributions of the coding statistic in the coding and noncoding fractions of the sequence windows. The accuracy of the method is evaluated using known data and application is made to the yeast chromosome III sequence and to C. elegans cosmid sequences. It can also be applied to fragmentary data, for example a collection of short sequences determined in the course of STS mapping.
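The decomposition step described above can be sketched as a two-component normal mixture fitted by expectation-maximisation. This is a minimal illustration, not the authors' procedure: the abstract does not specify the coding statistic or the fitting algorithm, so EM, the function names, and the synthetic window values below are all assumptions.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(values, iterations=200):
    """Fit a two-component 1-D normal mixture by EM.

    Returns two (weight, mean, sd) tuples, the component with the lower mean first.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    params = []
    # Initialise each component from one half of the sorted data.
    for part in (ordered[:half], ordered[half:]):
        m = sum(part) / len(part)
        s = math.sqrt(sum((v - m) ** 2 for v in part) / len(part)) or 1.0
        params.append([0.5, m, s])
    for _ in range(iterations):
        # E-step: probability that each window belongs to component 1.
        resp = []
        for v in values:
            p0 = params[0][0] * normal_pdf(v, params[0][1], params[0][2])
            p1 = params[1][0] * normal_pdf(v, params[1][1], params[1][2])
            denom = p0 + p1
            resp.append(p1 / denom if denom > 0 else 0.5)
        # M-step: re-estimate weight, mean and sd of both components.
        for k in (0, 1):
            r = [x if k == 1 else 1.0 - x for x in resp]
            total = sum(r)
            mean = sum(ri * v for ri, v in zip(r, values)) / total
            var = sum(ri * (v - mean) ** 2 for ri, v in zip(r, values)) / total
            params[k] = [total / len(values), mean, max(math.sqrt(var), 1e-6)]
    return sorted((tuple(p) for p in params), key=lambda p: p[1])


# Synthetic statistic values, invented for illustration only:
# 700 "noncoding" windows around 0 and 300 "coding" windows around 4.
rng = random.Random(0)
windows = [rng.gauss(0.0, 1.0) for _ in range(700)]
windows += [rng.gauss(4.0, 1.0) for _ in range(300)]

noncoding, coding = fit_two_normals(windows)
coding_fraction = coding[0]  # mixture weight of the higher-mean component
print(round(coding_fraction, 2))
```

Treating the higher-mean component as the coding one is itself an assumption about the statistic. The mixture weight is the estimated fraction of windows that are coding, which corresponds to coding density only insofar as windows are of equal length and each is predominantly coding or noncoding.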
Abstract:
The vast majority of the biology of a newly sequenced genome is inferred from the set of encoded proteins. Predicting this set is therefore invariably the first step after the completion of the genome DNA sequence. Here we review the main computational pipelines used to generate the human reference protein-coding gene sets.
Abstract:
The aim of this report is to present the application of a series of proposals on transcription, tagging and coding to two corpora: the bilingual corpus LC (La Canonja (Catalan-Spanish)) and the trilingual corpus CSCD (Code-switching as Communicative Design (Catalan-Spanish-English)). These proposals, which constitute the contribution of the IULA-LIPPS team (Language Interaction in Plurilingual and Plurilectal Speakers) to the coding manual of the LIDES system (Language Interaction Database Exchange System), adopted by the European group LIPPS, may be useful for transcribing, tagging and coding data from typologically close and distant languages.
Abstract:
A character network represents relations between characters from a text; the relations are based on text proximity, shared scenes/events, quoted speech, etc. Our project sketches a theoretical framework for character network analysis, bringing together narratology, both close and distant reading approaches, and social network analysis. It is in line with recent attempts to automatise the extraction of literary social networks (Elson, 2012; Sack, 2013) and other studies stressing the importance of character-systems (Woloch, 2003; Moretti, 2011). The method we use to build the network is direct and simple. First, we extract co-occurrences from a book index, without the need for text analysis. We then describe the narrative roles of the characters, which we deduce from their respective positions in the network, i.e. the discourse. As a case study, we use the autobiographical novel Les Confessions by Jean-Jacques Rousseau. We start by identifying co-occurrences of characters in the book index of our edition (Slatkine, 2012). Subsequently, we compute four types of centrality: degree, closeness, betweenness, eigenvector. We then use these measures to propose a typology of narrative roles for the characters. We show that the two parts of Les Confessions, written years apart, are structured around mirroring central figures that bear similar centrality scores. The first part revolves around the mentor of Rousseau; a figure of openness. The second part centres on a group of schemers, depicting a period of deep paranoia. We also highlight characters with intermediary roles: they provide narrative links between the societies in the life of the author. The method we detail in this complete case study of character network analysis can be applied to any work documented by an index.
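The index-based construction can be sketched as follows. This is an illustrative toy, not the authors' code: the index entries `A` to `D` and their page numbers are invented, the rule "two characters are linked when they share a page" is an assumed reading of "co-occurrence", and only two of the four centralities (degree and closeness) are computed to keep the example short.

```python
from collections import deque
from itertools import combinations


def build_network(index):
    """Link two characters whenever the index lists them on a common page."""
    graph = {name: set() for name in index}
    for a, b in combinations(index, 2):
        if index[a] & index[b]:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Number of neighbours divided by the maximum possible (n - 1)."""
    n = len(graph)
    return {v: len(nb) / (n - 1) for v, nb in graph.items()}


def closeness_centrality(graph):
    """Reachable nodes divided by the sum of shortest-path lengths to them."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for w in graph[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


# Hypothetical index: character name -> pages on which the name is listed.
toy_index = {
    "A": {1, 2, 3, 4},
    "B": {2, 5},
    "C": {4, 6},
    "D": {6, 7},
}

graph = build_network(toy_index)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
print(degree)
print(closeness)
```

In this toy network, `A` and `C` sit between the other characters and score highest on both measures, which is the kind of positional difference a typology of narrative roles would build on. Betweenness and eigenvector centrality would normally be taken from a graph library rather than written by hand.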
Abstract:
BACKGROUND: Conserved non-coding sequences in the human genome are approximately tenfold more abundant than known genes, and have been hypothesized to mark the locations of cis-regulatory elements. However, the global contribution of conserved non-coding sequences to the transcriptional regulation of human genes is currently unknown. Deeply conserved elements shared between humans and teleost fish predominantly flank genes active during morphogenesis and are enriched for positive transcriptional regulatory elements. However, such deeply conserved elements account for <1% of the conserved non-coding sequences in the human genome, which are predominantly mammalian. RESULTS: We explored the regulatory potential of a large sample of these 'common' conserved non-coding sequences using a variety of classic assays, including chromatin remodeling, and enhancer/repressor and promoter activity. When tested across diverse human model cell types, we find that the fraction of experimentally active conserved non-coding sequences within any given cell type is low (approximately 5%), and that this proportion increases only modestly when considered collectively across cell types. CONCLUSIONS: The results suggest that classic assays of cis-regulatory potential are unlikely to expose the functional potential of the substantial majority of mammalian conserved non-coding sequences in the human genome.
Abstract:
We have mapped the genes coding for two major structural polypeptides of the vaccinia virus core by hybrid selection and transcriptional mapping. First, RNA was selected by hybridization to restriction fragments of the vaccinia virus genome, translated in vitro and the products were immunoprecipitated with antibodies against the two polypeptides. This approach allowed us to map the genes to the left-hand end of the largest HindIII restriction fragment of 50 kilobase pairs. Second, transcriptional mapping of this region of the genome revealed the presence of the two expected RNAs. Both RNAs are transcribed from the leftward-reading strand and the 5'-ends of the genes are separated by about 7.5 kilobase pairs of DNA. Thus, two genes encoding structural polypeptides with a similar location in the vaccinia virus particle are clustered at approximately 105 kilobase pairs from the left-hand end of the 180 kilobase pair vaccinia virus genome.
Abstract:
Canonical correspondence analysis and redundancy analysis are two methods of constrained ordination regularly used in the analysis of ecological data when several response variables (for example, species abundances) are related linearly to several explanatory variables (for example, environmental variables, spatial positions of samples). In this report I demonstrate the advantages of the fuzzy coding of explanatory variables: first, nonlinear relationships can be diagnosed; second, more variance in the responses can be explained; and third, in the presence of categorical explanatory variables (for example, years, regions) the interpretation of the resulting triplot ordination is unified because all explanatory variables are measured at a categorical level.
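Fuzzy coding of a continuous explanatory variable can be sketched as below. The abstract does not say which membership functions are used, so the triangular memberships, the hinge points, and the function name `fuzzy_code` are assumptions made for illustration.

```python
def fuzzy_code(x, hinges):
    """Code x into len(hinges) fuzzy categories using triangular memberships.

    hinges must be strictly increasing. The returned memberships are
    non-negative and sum to 1; values outside the hinge range are assigned
    entirely to the first or last category.
    """
    k = len(hinges)
    memberships = [0.0] * k
    if x <= hinges[0]:
        memberships[0] = 1.0
    elif x >= hinges[-1]:
        memberships[-1] = 1.0
    else:
        for j in range(k - 1):
            lo, hi = hinges[j], hinges[j + 1]
            if lo <= x <= hi:
                # Linear interpolation between the two neighbouring categories.
                w = (x - lo) / (hi - lo)
                memberships[j] = 1.0 - w
                memberships[j + 1] = w
                break
    return memberships


# Hypothetical hinge points (for example minimum, median, maximum of a variable).
hinges = [0.0, 10.0, 30.0]
coded = fuzzy_code(15.0, hinges)
print(coded)
```

Each continuous variable becomes several columns of memberships, which puts it on the same categorical footing as variables such as year or region. A response that peaks in the middle category then shows up as a nonlinear relationship, which a single linear term could not capture.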
Abstract:
We consider adaptive sequential lossy coding of bounded individual sequences when the performance is measured by the sequentially accumulated mean squared distortion. The encoder and the decoder are connected via a noiseless channel of capacity $R$ and both are assumed to have zero delay. No probabilistic assumptions are made on how the sequence to be encoded is generated. For any bounded sequence of length $n$, the distortion redundancy is defined as the normalized cumulative distortion of the sequential scheme minus the normalized cumulative distortion of the best scalar quantizer of rate $R$ which is matched to this particular sequence. We demonstrate the existence of a zero-delay sequential scheme which uses common randomization in the encoder and the decoder such that the normalized maximum distortion redundancy converges to zero at a rate $n^{-1/5}\log n$ as the length of the encoded sequence $n$ increases without bound.
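The definition of distortion redundancy can be written out in symbols. The notation is assumed here rather than taken from the abstract: $x_1,\dots,x_n$ is the bounded sequence, $\hat{x}_1,\dots,\hat{x}_n$ are the reproductions produced by the sequential scheme, and $\mathcal{Q}_R$ is the set of scalar quantizers of rate $R$.

$$
\frac{1}{n}\sum_{i=1}^{n}\left(x_i-\hat{x}_i\right)^2 \;-\; \min_{Q\in\mathcal{Q}_R}\frac{1}{n}\sum_{i=1}^{n}\bigl(x_i-Q(x_i)\bigr)^2
$$

The stated result is that the maximum of this quantity over all bounded sequences of length $n$ converges to zero at rate $n^{-1/5}\log n$. Because the scheme is randomized, the first term is presumably understood in expectation over the common randomization; the abstract does not say so explicitly.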
Abstract:
Gene expression changes may underlie much of phenotypic evolution. The development of high-throughput RNA sequencing protocols has opened the door to unprecedented large-scale and cross-species transcriptome comparisons by allowing accurate and sensitive assessments of transcript sequences and expression levels. Here, we review the initial wave of the new generation of comparative transcriptomic studies in mammals and vertebrate outgroup species in the context of earlier work. Together with various large-scale genomic and epigenomic data, these studies have unveiled commonalities and differences in the dynamics of gene expression evolution for various types of coding and non-coding genes across mammalian lineages, organs, developmental stages, chromosomes and sexes. They have also provided intriguing new clues to the regulatory basis and phenotypic implications of evolutionary gene expression changes.