970 results for Complex Symbolic Sequence
Abstract:
This Final-year Project deals with Knowledge Discovery in numerical time series, approaching time series analysis from the viewpoint of the semantics of the series. Most of the research conducted to date in the field of time series analysis proposes analysing the values of the series numerically. This provides good results but does not allow the conclusions to be formulated in a way that justifies and interprets those results. The purpose of this project is therefore to create an application that analyses time series from a qualitative point of view, in contrast to the traditional quantitative method. In this way, all the relevant elements of the time series are gathered for future study. To achieve this objective, a mechanism is proposed to extract from the time series the information that is of interest for its analysis. To do this, the set of relevant behaviours in the domain is first formalised; these behaviours are the symbols shown in the output of the application. The designed and implemented method transforms a numerical time series into a symbolic sequence that captures all the semantics of the original time series and is more intuitive and easier to interpret. Once a mechanism for transforming numerical series into symbolic sequences is available, all analysis tasks can be carried out on those symbolic sequences. Although this project does not cover such post-analysis, it proposes several directions for future work. For instance, a measure of similarity between two symbolic sequences could serve as a starting point for comparison tasks, or for the creation of reference models for later analysis of time series.
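The abstract does not specify which behaviours form the symbol alphabet or how they are detected, so the following is only a minimal sketch of the general idea of turning a numerical series into a symbolic sequence. The three symbols (`U` for rising, `D` for falling, `S` for stable), the `tol` threshold, and the function names `symbolize` and `compress` are all assumptions made for illustration and are not taken from the project.

```python
from itertools import groupby


def symbolize(series, tol=0.1):
    """Map each step between consecutive values to a hypothetical symbol.

    'U' = rise larger than tol, 'D' = fall larger than tol, 'S' = stable.
    """
    symbols = []
    for prev, curr in zip(series, series[1:]):
        delta = curr - prev
        if delta > tol:
            symbols.append("U")
        elif delta < -tol:
            symbols.append("D")
        else:
            symbols.append("S")
    return symbols


def compress(symbols):
    """Collapse runs of identical symbols into (symbol, run_length) pairs."""
    return [(sym, len(list(run))) for sym, run in groupby(symbols)]


series = [1.0, 1.05, 2.0, 3.2, 3.1, 3.15, 2.0, 0.5]
sequence = compress(symbolize(series, tol=0.2))
print(sequence)
```

Under these assumptions the series reads as "briefly stable, rising, stable, falling", which is the kind of qualitative description the abstract aims for. A real alphabet would be defined per domain.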
Abstract:
We demonstrate that the ligand pocket of a lipocalin from Pieris brassicae, the bilin-binding protein (BBP), can be reshaped by combinatorial protein design such that it recognizes fluorescein, an established immunological hapten. For this purpose 16 residues at the center of the binding site, which is formed by four loops on top of an eight-stranded β-barrel, were subjected to random mutagenesis. Fluorescein-binding BBP variants were then selected from the mutant library by bacterial phage display. Three variants were identified that complex fluorescein with high affinity, exhibiting dissociation constants as low as 35.2 nM. Notably, one of these variants effects almost complete quenching of the ligand fluorescence, much as an anti-fluorescein antibody does. Detailed ligand-binding studies and site-directed mutagenesis experiments indicated (i) that the molecular recognition of fluorescein is specific and (ii) that charged residues at the center of the pocket are responsible for tight complex formation. Sequence comparison of the BBP variants directed against fluorescein with the wild-type protein and with further variants that were selected against several other ligands revealed that all of the randomized amino acid positions are variable. Hence, a lipocalin can be used for generating molecular pockets with a diversity of shapes. We term this class of engineered proteins “anticalins.” Their one-domain scaffold makes them a promising alternative to antibodies for creating a stable receptor protein for a ligand of choice.
Abstract:
During V(D)J recombination, the RAG (recombination-activating gene) complex cleaves DNA based on sequence specificity. Besides its physiological function, RAG has been shown to act as a structure-specific nuclease. Recently, we showed that the presence of cytosine within the single-stranded region of heteroduplex DNA is important when RAGs cleave on DNA structures. In the present study, we report that heteroduplex DNA containing a bubble region can be cleaved efficiently when present along with a recombination signal sequence (RSS) in cis or trans configuration. The sequence of the bubble region influences RAG cleavage at RSS when present in cis. We also find that the kinetics of RAG cleavage differs between RSS and bubble, wherein RSS cleavage reaches maximum efficiency faster than bubble cleavage. In addition, unlike RSS, RAG cleavage at bubbles does not lead to cleavage complex formation. Finally, we show that the “nonamer binding region,” which regulates RAG cleavage on RSS, is not important during RAG activity in non-B DNA structures. Therefore, in the current study, we identify the possible mechanism by which RAG cleavage is regulated when it acts as a structure-specific nuclease. © 2011 Elsevier Ltd. All rights reserved.
Abstract:
The sequence distribution of the monomeric units in the styrene-acrylic acid copolymer has been obtained by calculation. The probability of long sequences of styrene increases with an increase in the content of that monomer in the copolymer. The highest proportion of short sequences of styrene occurs for the copolymer containing equimolecular amounts of styrene and acrylic acid. The copolymer with this latter structure is inadequate for the synthesis of highly active supported complexes. When the distributions of long and short sequences of styrene are approximately equal, the activity of the polymer complexes prepared with Nd and Fe is higher.
Abstract:
The identification of chemical mechanisms that can exhibit oscillatory phenomena in reaction networks is currently of intense interest. In particular, the parametric question of the existence of Hopf bifurcations has gained increasing popularity due to its relation to oscillatory behavior around fixed points. However, the detection of oscillations in high-dimensional systems and systems with constraints by the available symbolic methods has proven to be difficult. The development of new efficient methods is therefore required to tackle the complexity caused by the high dimensionality and non-linearity of these systems. In this thesis, we mainly present efficient algorithmic methods to detect Hopf bifurcation fixed points in (bio)chemical reaction networks with symbolic rate constants, thereby yielding information about the oscillatory behavior of the networks. The methods use representations of the systems in convex coordinates that arise from stoichiometric network analysis. One of the methods, called HoCoQ, reduces the problem of determining the existence of Hopf bifurcation fixed points to a first-order formula over the ordered field of the reals that can then be solved using computational-logic packages. The second method, called HoCaT, uses ideas from tropical geometry to formulate a more efficient method that is incomplete in theory but worked very well for the attempted high-dimensional models involving more than 20 chemical species. The instability of reaction networks may lead to oscillatory behaviour. Therefore, we investigate some criteria for their stability using convex coordinates and quantifier elimination techniques.
We also study Muldowney's extension of the classical Bendixson-Dulac criterion for excluding periodic orbits to higher dimensions for polynomial vector fields, and we discuss the use of simple conservation constraints and of parametric constraints for describing simple convex polytopes on which periodic orbits can be excluded by Muldowney's criteria. All developed algorithms have been integrated into a common software framework called PoCaB (platform to explore biochemical reaction networks by algebraic methods), allowing for automated computation workflows from the problem descriptions. PoCaB also contains a database for the algebraic entities computed from the models of chemical reaction networks.
Abstract:
Picosecond transient absorption (TA) and time-resolved infrared (TRIR) measurements of rac-[Cr(phen)2(dppz)]3+ (1) intercalated into double-stranded guanine-containing DNA reveal that the excited state is very rapidly quenched. As no evidence was found for the transient electron transfer products, it is proposed that the back electron transfer reaction must be even faster (<3 ps).
Abstract:
The IMGT/HLA Database (www.ebi.ac.uk/imgt/hla/) specialises in sequences of polymorphic genes of the HLA system, the human major histocompatibility complex (MHC). The HLA complex is located within the 6p21.3 region on the short arm of human chromosome 6 and contains more than 220 genes of diverse function. Many of the genes encode proteins of the immune system and these include the 21 highly polymorphic HLA genes, which influence the outcome of clinical transplantation and confer susceptibility to a wide range of non-infectious diseases. The database contains sequences for all HLA alleles officially recognised by the WHO Nomenclature Committee for Factors of the HLA System and provides users with online tools and facilities for their retrieval and analysis. These include allele reports, alignment tools and detailed descriptions of the source cells. The online IMGT/HLA submission tool allows both new and confirmatory sequences to be submitted directly to the WHO Nomenclature Committee. The latest version (release 1.7.0 July 2000) contains 1220 HLA alleles derived from over 2700 component sequences from the EMBL/GenBank/DDBJ databases. The HLA database provides a model which will be extended to provide specialist databases for polymorphic MHC genes of other species.
Abstract:
Tissue and cell-type specific expression of the rat osteocalcin (rOC) gene involves the interplay of multiple transcriptional regulatory factors. In this report we demonstrate that AML-1 (acute myeloid leukemia-1), a DNA-binding protein whose genes are disrupted by chromosomal translocations in several human leukemias, interacts with a sequence essential for enhancing tissue-restricted expression of the rOC gene. Deletion analysis of rOC promoter-chloramphenicol acetyltransferase constructs demonstrates that an AML-1-binding sequence within the proximal promoter (-138 to -130 nt) contributes to 75% of the level of osteocalcin gene expression. The activation potential of the AML-1-binding sequence has been established by overexpressing AML-1 in osteoblastic as well as in nonosseous cell lines. Overexpression not only enhances rOC promoter activity in osteoblasts but also mediates OC promoter activity in a nonosseous human fibroblastic cell line. A probe containing this site forms a sequence specific protein-DNA complex with nuclear extracts from osteoblastic cells but not from nonosseous cells. Antisera supershift experiments indicate the presence of AML-1 and its partner protein core-binding factor beta in this osteoblast-restricted complex. Mutations of the critical AML-1-binding nucleotides abrogate formation of the complex and strongly diminish promoter activity. These results indicate that an AML-1 related protein is functional in cells of the osteoblastic lineage and that the AML-1-binding site is a regulatory element important for osteoblast-specific transcriptional activation of the rOC gene.
Abstract:
Transmission of human immunodeficiency virus 1 (HIV-1) from an infected woman to her offspring during gestation and delivery was found to be influenced by the infant's major histocompatibility complex class II DRB1 alleles. Forty-six HIV-infected infants and 63 seroreverting infants, born with passively acquired anti-HIV antibodies but not becoming detectably infected, were typed by an automated nucleotide-sequence-based technique that uses low-resolution PCR to select either the simpler Taq or the more demanding T7 sequencing chemistry. One or more DR13 alleles, including DRB1*1301, 1302, and 1303, were found in 31.7% of seroreverting infants and 15.2% of those becoming HIV-infected [OR (odds ratio) = 2.6 (95% confidence interval 1.0-6.8); P = 0.048]. This association was influenced by ethnicity, being seen more strongly among the 80 Black and Hispanic children [OR = 4.3 (1.2-16.4); P = 0.023], with the most pronounced effect among Black infants, where 7 of 24 seroreverters inherited these alleles compared with none among 12 HIV-infected infants (Haldane OR = 12.3; P = 0.037). The previously recognized association of DR13 alleles with some situations of long-term nonprogression of HIV suggests that similar mechanisms may regulate both the occurrence of infection and disease progression after infection. Upon examining for residual associations, only the DR2 allele DRB1*1501 was associated with seroreversion in Caucasoid infants (OR = 24; P = 0.004). Among Caucasoids the DRB1*03011 allele was positively associated with the occurrence of HIV infection (P = 0.03).
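The overall odds ratio quoted above can be checked with a short calculation. The abstract reports only percentages, so the cell counts below are inferred from them (31.7% of 63 is about 20 seroreverters, and 15.2% of 46 is about 7 infected infants carrying a DR13 allele). The Woolf (log-based) interval is an assumption: the abstract does not say which method the authors used. The function name `odds_ratio_ci` is illustrative.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a, b: group 1 with / without the factor
    c, d: group 2 with / without the factor
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Counts inferred from the reported percentages (an assumption, see above):
# seroreverters: 20 with DR13, 43 without; infected: 7 with DR13, 39 without.
or_, lo, hi = odds_ratio_ci(20, 43, 7, 39)
print(f"OR = {or_:.1f} (95% CI {lo:.1f}-{hi:.1f})")
```

With these inferred counts the sketch gives an odds ratio and interval that round to the reported 2.6 (1.0-6.8). The Haldane-corrected estimate for the Black infants subgroup is not reproduced here.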
Abstract:
The bithorax complex (BX-C) of Drosophila, one of two complexes that act as master regulators of the body plan of the fly, is included within a sequence of 338,234 bp (SEQ89E). This paper presents the strategy used in sequencing SEQ89E and an analysis of its open reading frames. The BX-C sequence (BXCALL) contains 314,895 bp obtained by deletion of putative genes that are located at each end of SEQ89E and appear to be functionally unrelated to the BX-C. Only 1.4% of BXCALL codes for the three homeodomain-containing proteins of the complex. Principal findings include a putative ABD-A protein (ABD-AII) larger than a previously known ABD-A protein and a putative glucose transporter-like gene (1521 bp) located at or near the bithoraxoid (bxd), infra-abdominal-2 (iab-2) boundary on the opposite strand relative to that of the homeobox-containing genes.
Abstract:
The bithorax complex (BX-C) of Drosophila, one of two complexes that act as master regulators of the body plan of the fly, has now been entirely sequenced and comprises approximately 315,000 bp, only 1.4% of which codes for protein. Analysis of this sequence reveals significantly overrepresented DNA motifs of unknown, as well as known, functions in the non-protein-coding portion of the sequence. The following types of motifs in that portion are analyzed: (i) concatamers of mono-, di-, and trinucleotides; (ii) tightly clustered hexanucleotides (spaced ≤ 5 bases apart); (iii) direct and reverse repeats longer than 20 bp; and (iv) a number of motifs known from biochemical studies to play a role in the regulation of the BX-C. The hexanucleotide AGATAC is remarkably overrepresented and is surmised to play a role in chromosome pairing. The positions of sites of highly overrepresented motifs are plotted for those that occur at more than five sites in the sequence when fewer than 0.5 occurrences are expected. Expected values are based on a third-order Markov chain, which is the optimal order for representing the BXCALL sequence.
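To make the notion of an expected value under a third-order Markov chain concrete, here is a minimal sketch that fits such a chain to a sequence by simple counting and estimates how often a motif should occur. The estimator (maximum-likelihood transition frequencies, ignoring overlap corrections) and the function names are assumptions for illustration. The abstract does not describe the authors' exact procedure.

```python
from collections import Counter


def kmer_counts(seq, k, stop=None):
    """Count k-mers starting at positions 0 .. stop-1 (default: all that fit)."""
    if stop is None:
        stop = len(seq) - k + 1
    return Counter(seq[i:i + k] for i in range(stop))


def expected_count(seq, motif, order=3):
    """Approximate expected occurrences of motif under an order-`order` Markov chain."""
    if len(motif) <= order or len(seq) < len(motif):
        raise ValueError("motif must be longer than the order and fit in seq")
    starts = kmer_counts(seq, order)                          # all contexts
    followed = kmer_counts(seq, order, stop=len(seq) - order)  # contexts with a next base
    transitions = kmer_counts(seq, order + 1)                  # context plus next base
    # Probability that a random position starts with the motif's first `order` bases.
    prob = starts[motif[:order]] / (len(seq) - order + 1)
    # Multiply by each conditional probability P(next base | previous `order` bases).
    for i in range(order, len(motif)):
        context = motif[i - order:i]
        if followed[context] == 0:
            return 0.0
        prob *= transitions[context + motif[i]] / followed[context]
    return prob * (len(seq) - len(motif) + 1)


def observed_count(seq, motif):
    """Count occurrences of motif, allowing overlaps."""
    return sum(seq.startswith(motif, i) for i in range(len(seq) - len(motif) + 1))


toy = "ACGT" * 50  # toy sequence, not real BX-C data
print(observed_count(toy, "ACGTAC"), expected_count(toy, "ACGTAC"))
```

A motif would then be flagged as overrepresented in the sense of the abstract when `observed_count` exceeds five while `expected_count` is below 0.5.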
Abstract:
Funding: Silvia S. Monteiro and Marisa Ferreira were supported by Ph.D. grants from Fundação para a Ciência e Tecnologia (ref SFRH/BD/38735/2007 and SFRH/BD/30240/2006, respectively). Alfredo López was supported by a postdoctoral grant from Fundação para a Ciência e Tecnologia (ref SFRH/BPD/82407/2011). Catarina Eira is supported by CESAM (UID/AMB/50017), from FCT/MEC through national funds and FEDER (PT2020, Compete 2020). The work related to strandings and tissue collection in Portugal was partially supported by the SafeSea Project EEAGrants PT 0039 (supported by Iceland, Liechtenstein and Norway through the EEA Financial Mechanism), by the Project MarPro–Life09 NAT/PT/000038 (funded by the European Union–Program Life+) and by the project CetSenti FCT RECI/AAG-GLO/0470/2012; FCOMP-01-0124-FEDER-027472 (funded by the Program COMPETE and Fundação para a Ciência e Tecnologia).
Abstract:
Smut fungi are important pathogens of grasses, including the cultivated crops maize, sorghum and sugarcane. Typically, smut fungi infect the inflorescence of their host plants. Three genera of smut fungi (Ustilago, Sporisorium and Macalpinomyces) form a complex with overlapping morphological characters, making species placement problematic. For example, the newly described Macalpinomyces mackinlayi possesses a combination of morphological characters such that it cannot be unambiguously accommodated in any of the three genera. Previous attempts to define Ustilago, Sporisorium and Macalpinomyces using morphology and molecular phylogenetics have highlighted the polyphyletic nature of the genera, but have failed to produce a satisfactory taxonomic resolution. A detailed systematic study of 137 smut species in the Ustilago-Sporisorium-Macalpinomyces complex was completed in the current work. Morphological and DNA sequence data from five loci were assessed with maximum likelihood and Bayesian inference to reconstruct a phylogeny of the complex. The phylogenetic hypotheses generated were used to identify morphological synapomorphies, some of which had previously been dismissed as a useful way to delimit the complex. These synapomorphic characters are the basis for a revised taxonomic classification of the Ustilago-Sporisorium-Macalpinomyces complex, which takes into account their morphological diversity and coevolution with their grass hosts. The new classification is based on a redescription of the type genus Sporisorium, and the establishment of four genera, described from newly recognised monophyletic groups, to accommodate species expelled from Sporisorium. Over 150 taxonomic combinations have been proposed as an outcome of this investigation, which makes a rigorous and objective contribution to the fungal systematics of these important plant pathogens.