965 results for Boolean Functions, Nonlinearity, Evolutionary Computation, Equivalence Classes
Abstract:
A Network of Evolutionary Processors, or NEP, is a computational model inspired by the evolution of cells, specifically by their multiplication rules. This inspiration makes the model a syntactic abstraction of the way cells manipulate information. In particular, a NEP defines a theoretical computing machine able to solve NP-complete problems efficiently in terms of time. In practice, NEPs simulated on conventional computers are expected to solve complex real-world problems (requiring high scalability) at the cost of high spatial complexity. In the NEP model, cells are represented by words that encode their DNA sequences. Informally, at any moment of the system's computation, its evolutionary state is described as a collection of words, each of which represents a cell. These fixed moments of evolution are called configurations. As in the biological model, words (cells) mutate and divide through simple bio-operations, but only the fit words (much as in natural selection) are kept for the next configuration. As a computing tool, a NEP defines a parallel and distributed architecture for symbolic processing, in other words, a network of language processors. Since the model was proposed to the scientific community in 2001, multiple variants have been developed, and their properties regarding computational completeness, efficiency, and universality have been extensively studied and proved. We can therefore consider that the theoretical NEP model has reached maturity.
The main motivation of this End of Degree Project is to propose a practical approach for taking the theoretical NEP model to a real implementation that can run on high-performance computing platforms, in order to solve the complex problems demanded by today's society. So far, the tools developed to simulate the NEP model, although correct and with satisfactory results, are usually tied to their execution environment, whether through the use of specific hardware or through problem-specific implementations. In this context, the fundamental purpose of this work is the development of Nepfix, a generic and extensible tool for executing any algorithm of a NEP model (or any of its variants), either locally, as a traditional application, or distributed using cloud services. Nepfix is a software application developed over 7 months, currently in its second iteration after leaving the prototype phase. Nepfix is designed as a modular, self-contained application written in Java 8; that is, it does not require a specific execution environment (any Java virtual machine is a valid container). Nepfix consists of two components or modules. The first module corresponds to the execution of a NEP and is therefore the simulator. Its development took the current state of the model into account, that is, the definitions of the most common processors and filters that make up the NEP model family. Additionally, this component offers flexibility in execution: the simulator's capabilities can be extended without modifying Nepfix, using a scripting language.
As part of this component, a standard representation of the NEP model based on the JSON format has also been defined, along with a proposed way of representing and encoding words, which is needed for communication between servers. Additionally, an important characteristic of this component is that it can be considered a stand-alone application, so the distribution and execution strategies are fully independent. The second module corresponds to the distribution of Nepfix in the cloud. This development is the result of an R&D process with a considerable scientific component. The development of this module is worth highlighting not only for the expected practical results, but also for the research process that must be undertaken with this new perspective on the execution of natural computing systems. The main characteristic of applications running in the cloud is that they are managed by the platform and are normally encapsulated in a container. In the case of Nepfix, this container is a Spring application that uses the HTTP or AMQP protocol to communicate with the other instances. As added value, Nepfix addresses two distinct implementation perspectives of the distribution and execution model (developed in two different iterations), which have a very significant impact on the simulator's capabilities and restrictions. Specifically, the first iteration uses an asynchronous execution model. In this asynchronous perspective, the components of the NEP network (processors and filters) are treated as elements that react to the need to process a word.
This implementation is an optimization of a common topology in the NEP model that makes it possible to use cloud tools to achieve transparent scaling (with respect to load balancing between processors), but it produces undesired effects such as nondeterminism in the order of the results or the impossibility of efficiently distributing strongly interconnected networks. The second iteration corresponds to the synchronous execution model. The elements of a NEP network follow a start-compute-synchronize cycle until the problem is solved. This synchronous perspective faithfully represents the theoretical NEP model, but the synchronization process is costly and requires additional infrastructure; specifically, a RabbitMQ message queue server. However, in this perspective the benefits for sufficiently large problems outweigh the drawbacks, since distribution is immediate (there are no restrictions), although the scaling process is not trivial. In short, the concept of Nepfix as a computational framework can be considered satisfactory: the technology is viable, and the first results confirm that the characteristics originally sought have been achieved. Many fronts remain open for future research. This document proposes some approaches to solving the problems identified, such as error recovery and the dynamic splitting of a NEP into different subdomains. Other problems beyond the scope of this project remain open to future development, for example the standardization of word representation and optimizations in the execution of the synchronous model.
Finally, some preliminary results of this End of Degree Project were recently presented as a scientific paper at the "International Work-Conference on Artificial Neural Networks (IWANN) 2015" and published in "Advances in Computational Intelligence", volume 9094 of Springer International Publishing's "Lecture Notes in Computer Science". This confirms that this work, more than an End of Degree Project, is only the beginning of work that may have greater impact on the scientific community. Abstract Network of Evolutionary Processors -NEP is a computational model inspired by the evolution of cell populations, which might model some properties of evolving cell communities at the syntactical level. NEP defines theoretical computing devices able to solve NP-complete problems in an efficient manner. In this model, cells are represented by words which encode their DNA sequences. Informally, at any moment of time, the evolutionary system is described by a collection of words, where each word represents one cell. Cells belong to species and their community evolves according to mutations and division which are defined by operations on words. Only those cells represented by a word in a given set of words, called the genotype space of the species, are accepted as surviving (correct) ones. This feature is analogous to the natural process of evolution. Formally, NEP is based on an architecture for parallel and distributed processing, in other words, a network of language processors. Since the date when NEP was proposed, several extensions and variants have appeared, engendering a new set of models named Networks of Bio-inspired Processors (NBP). During this time, several works have proved the computational power of NBP. Specifically, their efficiency, universality, and computational completeness have been thoroughly investigated. Therefore, we can say that the NEP model has reached its maturity.
The main motivation for this End of Grade project (EOG project in short) is to propose a practical approach that closes the gap between the theoretical NEP model and a practical implementation on high-performance computing platforms, in order to solve some of the high-complexity problems society requires today. Up until now, tools developed to simulate NEPs, while correct and successful, are usually tightly coupled to the execution environment, using specific software frameworks (Hadoop) or direct hardware usage (GPUs). Within this context, the main purpose of this work is the development of Nepfix, a generic and extensible tool that aims to execute algorithms based on the NEP model and compatible variants, either locally, like a traditional application, or in a distributed cloud environment. Nepfix was developed during a 7-month cycle and is undergoing its second iteration now that the prototype period is over. Nepfix is designed as a modular, self-contained application written in Java 8; that is, no additional external dependencies are required and it does not rely on a specific execution environment: any JVM is a valid container. Nepfix is made of two components or modules. The first module corresponds to NEP execution and therefore simulation. During development, the current state of the theoretical model was used as a reference, including the most common filters and processors. Additionally, extensibility is provided by the use of Python as a scripting language to run custom logic. Along with the simulation, a definition language for NEPs has been defined based on JSON, as well as a mechanism to represent words and their possible manipulations. The NEP simulator is isolated from distribution and, as mentioned before, different applications can include it as a dependency; the distribution of NEPs is an example of this. The second module corresponds to executing Nepfix in the cloud.
The development carried a heavy R&D process, since this front had not been explored by other research groups until now. It is important to point out that the development of this module is not focused on results at this point in time; instead, we focus on feasibility and discovery of this new perspective for executing natural computing systems, and NEPs specifically. The main property of cloud applications is that they are managed by the platform and encapsulated in a container. For Nepfix, a Spring application becomes the container, and the HTTP or AMQP protocols are used for communication with the rest of the instances. Different execution perspectives were studied: namely, asynchronous and synchronous models were developed for solving different kinds of problems using NEPs. Different limitations and restrictions manifest in both models and are explored in detail in the respective chapters. In conclusion, we can consider that Nepfix as a computational framework is successful: cloud technology is ready for the challenge, and the first results reassure us that the properties the Nepfix project pursued were met. Many research branches are left open for future work. In this EOG, implementation guidelines are proposed for some of them, like error recovery or dynamic NEP splitting. On the other hand, other interesting problems that were not in the scope of this project were identified during development, like word representation standardization or NEP model optimizations. As a confirmation that the results of this work can be useful to the scientific community, a preliminary version of this project was published in The International Work-Conference on Artificial Neural Networks (IWANN) in May 2015.
Development has not stopped since that point, and while Nepfix in its current state cannot be considered a final product, the most relevant ideas, possible problems, and solutions produced during the seven-month development cycle are worth gathering and presenting, giving meaning to this EOG work.
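The evolution-and-filter cycle the abstract describes can be sketched in a few lines. The processor structure, rule format, and filter predicates below are illustrative assumptions for a minimal synchronous step, not Nepfix's actual API:

```python
# Minimal sketch of one synchronous NEP step: each processor mutates its
# words, then words travel along edges if they pass the sender's output
# filter and the receiver's input filter. All names here are hypothetical.

def mutate(word, rules):
    """Apply every single-symbol substitution rule at every position."""
    out = set()
    for a, b in rules:  # rule (a, b): replace one occurrence of a by b
        for i, ch in enumerate(word):
            if ch == a:
                out.add(word[:i] + b + word[i + 1:])
    return out or {word}  # word survives unchanged if no rule applies

def step(processors, edges):
    """One start-compute-synchronize cycle over the whole network."""
    # Evolutionary phase: every node rewrites its own words.
    evolved = {n: set().union(*(mutate(w, p["rules"]) for w in p["words"]))
               for n, p in processors.items()}
    # Decide, per node, which words stay and which are exported.
    kept = {n: {w for w in evolved[n] if not processors[n]["out"](w)}
            for n in processors}
    sent = {n: evolved[n] - kept[n] for n in processors}
    for n, p in processors.items():
        p["words"] = kept[n]
    # Communication phase: exported words move along the edges.
    for n in processors:
        for m in edges.get(n, []):
            processors[m]["words"] |= {w for w in sent[n]
                                       if processors[m]["in"](w)}

# Toy two-node network: node a rewrites 'x' to 'y' and exports every word
# containing 'y'; node b accepts anything it receives.
procs = {
    "a": {"words": {"xx"}, "rules": [("x", "y")],
          "out": lambda w: "y" in w, "in": lambda w: True},
    "b": {"words": set(), "rules": [],
          "out": lambda w: False, "in": lambda w: True},
}
step(procs, {"a": ["b"]})
```

Splitting the cycle into an evolutionary phase and a communication phase mirrors the configurations of the theoretical model: the network state between two calls to `step` is exactly one configuration.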
Abstract:
The evolution of smartphones equipped with digital cameras is driving a growing demand for ever more complex applications that need real-time computer vision algorithms; since video signals only keep growing in size while the performance of single-core processors has stagnated, new computer vision algorithms must be designed to run in parallel on multiple processors and be computationally scalable. One of the most interesting classes of processors today is found in graphics cards (GPUs), devices that offer a high degree of parallelism, excellent numerical performance, and growing versatility, which makes them attractive for scientific computing. This thesis explores two computer vision applications of great computational complexity that cannot be executed in real time on traditional processors. As the thesis shows, however, parallelizing the different subtasks and implementing them on a GPU yields the desired result of execution at interactive frame rates. A technique is also proposed for the fast evaluation of functions of arbitrary complexity, especially suited to GPU implementation. First, we study the application of virtual image synthesis techniques using only two distant, non-parallel cameras, in contrast with the usual 3D TV configuration of close, parallel cameras, with color and depth information. Using modified median filters to build a virtual depth map, together with inverse projections, we verify that these techniques are adequate for free choice of viewpoint.
Furthermore, we show that encoding depth information with respect to a global reference system is highly detrimental and should be avoided. We also propose a moving-object detection system based on density estimation techniques with local functions. These techniques are well suited to modelling complex scenes with multimodal backgrounds, but have seen little use due to their great computational complexity. The proposed system, implemented in real time on a GPU, includes proposals for the dynamic estimation of the bandwidths of the local functions, selective updating of the background model, updating of the positions of the reference samples of the foreground model using a multi-region particle filter, and automatic selection of regions of interest to reduce computational cost. The results, evaluated on several databases and compared with other state-of-the-art algorithms, demonstrate the great versatility and quality of the proposal. Finally, we propose a method for approximating arbitrary functions using continuous piecewise linear functions, especially suited to GPU implementation through the use of the texture filtering units, normally not used for numerical computation. The proposal includes a rigorous mathematical analysis of the approximation error as a function of the number of samples used, as well as a method for obtaining a quasi-optimal partition of the function's domain to minimize the error. ABSTRACT The evolution of smartphones, all equipped with digital cameras, is driving a growing demand for ever more complex applications that need to rely on real-time computer vision algorithms. However, video signals are only increasing in size, whereas the performance of single-core processors has somewhat stagnated in the past few years.
Consequently, new computer vision algorithms will need to be parallel to run on multiple processors and be computationally scalable. One of the most promising classes of processors nowadays can be found in graphics processing units (GPU). These are devices offering a high parallelism degree, excellent numerical performance and increasing versatility, which makes them interesting for scientific computation. In this thesis, we explore two computer vision applications with a high computational complexity that precludes them from running in real time on traditional uniprocessors. However, we show that by parallelizing subtasks and implementing them on a GPU, both applications attain their goals of running at interactive frame rates. In addition, we propose a technique for fast evaluation of arbitrarily complex functions, specially designed for GPU implementation. First, we explore the application of depth-image-based rendering techniques to the unusual configuration of two convergent, wide-baseline cameras, in contrast to the usual configuration used in 3D TV, which uses narrow-baseline, parallel cameras. By using a backward mapping approach with a depth inpainting scheme based on median filters, we show that these techniques are adequate for free viewpoint video applications. In addition, we show that referring depth information to a global reference system is ill-advised and should be avoided. Then, we propose a background subtraction system based on kernel density estimation techniques. These techniques are well suited to modelling complex scenes featuring multimodal backgrounds, but have not been so popular due to their huge computational and memory complexity.
The proposed system, implemented in real time on a GPU, features novel proposals for dynamic kernel bandwidth estimation for the background model, selective update of the background model, update of the position of reference samples of the foreground model using a multi-region particle filter, and automatic selection of regions of interest to reduce computational cost. The results, evaluated on several databases and compared to other state-of-the-art algorithms, demonstrate the high quality and versatility of our proposal. Finally, we propose a general method for the approximation of arbitrarily complex functions using continuous piecewise linear functions, specially formulated for GPU implementation by leveraging their texture filtering units, normally unused for numerical computation. Our proposal features a rigorous mathematical analysis of the approximation error as a function of the number of samples, as well as a method to obtain a quasi-optimal partition of the domain of the function to minimize approximation error.
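The texture-unit trick in the last abstract relies on nothing more exotic than uniform sampling plus linear interpolation between adjacent samples. A minimal CPU-side sketch (the test function, interval, and sample counts are chosen arbitrarily for illustration) shows the mechanism and lets one observe the error shrinking as samples are added:

```python
import math

def make_pwl(f, a, b, n):
    """Sample f at n+1 uniform points on [a, b] and return a continuous
    piecewise linear interpolant - the same operation a GPU texture
    filtering unit performs in hardware on a 1D texture lookup."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    ys = [f(x) for x in xs]

    def g(x):
        t = (x - a) / (b - a) * n   # position in "texel" coordinates
        i = min(int(t), n - 1)      # segment index, clamped at the end
        frac = t - i                # fractional position inside segment
        return ys[i] * (1 - frac) + ys[i + 1] * frac

    return g

# Usage: approximate sin on [0, pi] with 32 segments and measure the
# maximum error over a dense grid of evaluation points.
g = make_pwl(math.sin, 0.0, math.pi, 32)
err = max(abs(math.sin(x) - g(x))
          for x in [math.pi * k / 500 for k in range(501)])
```

For a function with bounded second derivative, the worst-case error of this scheme decays quadratically in the number of segments, which is what makes a hardware-filtered texture a cheap stand-in for expensive function evaluation.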
Abstract:
Wave energy conversion has an essential difference from other renewable energies, since the dependence between the device's design and the energy resource is stronger. Dimensioning is therefore considered a key stage when a design project of Wave Energy Converters (WEC) is undertaken. Location, WEC concept, Power Take-Off (PTO) type, control strategy and hydrodynamic resonance considerations are some of the critical aspects to take into account to achieve a good performance. The paper proposes an automatic dimensioning methodology to be accomplished at the initial design project stages, and the following elements are described to carry out the study: an optimization design algorithm, its objective functions and restrictions, a PTO model, as well as a procedure to evaluate the WEC energy production. After that, a parametric analysis is included considering different combinations of the key parameters previously introduced. A variety of study cases are analysed from the point of view of energy production for different design parameters, and all of them are compared with a reference case. Finally, a discussion is presented based on the results obtained, and some recommendations to face the WEC design stage are given.
Abstract:
The Ising problem consists in finding the analytical solution of the partition function of a lattice once the interaction geometry among its elements is specified. No general analytical solution is available for this problem, except for the one-dimensional case. Using site-specific thermodynamics, it is shown that the partition function for ligand binding to a two-dimensional lattice can be obtained from those of one-dimensional lattices with known solution. The complexity of the lattice is reduced recursively by application of a contact transformation that involves a relatively small number of steps. The transformation implemented in a computer code solves the partition function of the lattice by operating on the connectivity matrix of the graph associated with it. This provides a powerful new approach to the Ising problem, and enables a systematic analysis of two-dimensional lattices that model many biologically relevant phenomena. Application of this approach to finite two-dimensional lattices with positive cooperativity indicates that the binding capacity per site diverges as N^a (N = number of sites in the lattice) and experiences a phase-transition-like discontinuity in the thermodynamic limit N → ∞. The zeroes of the partition function tend to distribute on a slightly distorted unit circle in complex plane and approach the positive real axis already for a 5×5 square lattice. When the lattice has negative cooperativity, its properties mimic those of a system composed of two classes of independent sites with the apparent population of low-affinity binding sites increasing with the size of the lattice, thereby accounting for a phenomenon encountered in many ligand-receptor interactions.
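The one-dimensional solution this abstract builds on is the classical transfer-matrix result: for a ring of N spins with nearest-neighbour coupling J, zero field, and kT = 1, the partition function is the trace of the N-th power of the 2×2 transfer matrix, Z = (2 cosh J)^N + (2 sinh J)^N. A short sketch (parameter values chosen only for illustration) verifies the closed form against brute-force enumeration:

```python
import itertools
import math

def ising_ring_Z_exact(N, J):
    """Transfer-matrix partition function of a 1D Ising ring:
    Z = Tr(T^N) = l+^N + l-^N, with eigenvalues l+/- = 2 cosh J, 2 sinh J
    of the matrix T(s, s') = exp(J * s * s') (zero field, kT = 1)."""
    return (2 * math.cosh(J)) ** N + (2 * math.sinh(J)) ** N

def ising_ring_Z_bruteforce(N, J):
    """Sum exp(J * sum_i s_i s_{i+1}) over all 2^N spin configurations,
    with periodic boundary conditions. Exponential cost: only for tiny N."""
    Z = 0.0
    for spins in itertools.product((-1, 1), repeat=N):
        E = sum(spins[i] * spins[(i + 1) % N] for i in range(N))
        Z += math.exp(J * E)
    return Z
```

The brute-force sum grows as 2^N while the transfer-matrix form is closed, which is precisely the gap the abstract's contact transformation tries to bridge for two-dimensional lattices.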
Evolutionary analyses of hedgehog and Hoxd-10 genes in fish species closely related to the zebrafish
Abstract:
The study of development has relied primarily on the isolation of mutations in genes with specific functions in development and on the comparison of their expression patterns in normal and mutant phenotypes. Comparative evolutionary analyses can complement these approaches. Phylogenetic analyses of Sonic hedgehog (Shh) and Hoxd-10 genes from 18 cyprinid fish species closely related to the zebrafish provide novel insights into the functional constraints acting on Shh. Our results confirm and extend those gained from expression and crystal structure analyses of this gene. Unexpectedly, exon 1 of Shh is found to be almost invariant even in third codon positions among these morphologically divergent species, suggesting that this exon encodes a functionally important domain of the hedgehog protein. This is surprising because the main functional domain of Shh had been thought to be that encoded by exon 2. Comparisons of Shh and Hoxd-10 gene sequences and of resulting gene trees document higher evolutionary constraints on the former than on the latter. This might be indicative of more general evolutionary patterns in networks of developmental regulatory genes interacting in a hierarchical fashion. The presence of four members of the hedgehog gene family in cyprinid fishes was documented and their homologies to known hedgehog genes in other vertebrates were established.
Abstract:
The rat mitochondrial outer membrane-localized benzodiazepine receptor (MBR) was expressed in wild-type and TspO− (tryptophan-rich sensory protein) strains of the facultative photoheterotroph, Rhodobacter sphaeroides 2.4.1, and was shown to retain its structure within the bacterial outer membrane as assayed by its binding properties with a variety of MBR ligands. Functionally, it was able to substitute for TspO by negatively regulating the expression of photosynthesis genes in response to oxygen. This effect was reversed pharmacologically with the MBR ligand PK11195. These results suggest a close evolutionary and functional relationship between the bacterial TspO and the MBR. This relationship provides further support for the origin of the mammalian mitochondrion from a “photosynthetic” precursor. Finally, these findings provide novel insights into the physiological role that has been obscure for the MBR in situ.
Abstract:
Selected aspects of the evolutionary process, and more specifically of genetic variation, are considered, with an emphasis on studies performed by my group. One key aspect of evolution seems to be the concomitant occurrence of dichotomic, contradictory (dialectic) processes. Genetic variation is structured, and the dynamics of change at one level is not necessarily paralleled by that in another. The pathogenesis-related protein superfamily can be cited as an example in which permanence (the maintenance of certain key genetic features) coexists with change (modifications that led to different functions in different classes of organisms). Relationships between structure and function are exemplified by studies with hemoglobin Porto Alegre. The genetic structure of tribal populations may differ in important aspects from that of industrialized societies. Evolutionary histories also may differ when considered through the investigation of patrilineal or matrilineal lineages. Global evaluations taking into consideration all of these aspects are needed if we really want to understand the meaning of genetic variation.
Abstract:
We created a simulation based on experimental data from bacteriophage T7 that computes the developmental cycle of the wild-type phage and also of mutants that have an altered genome order. We used the simulation to compute the fitness of more than 10^5 mutants. We tested these computations by constructing and experimentally characterizing T7 mutants in which we repositioned gene 1, coding for T7 RNA polymerase. Computed protein synthesis rates for ectopic gene 1 strains were in moderate agreement with observed rates. Computed phage-doubling rates were close to observations for two of four strains, but significantly overestimated those of the other two. Computations indicate that the genome organization of wild-type T7 is nearly optimal for growth: only 2.8% of random genome permutations were computed to grow faster, the highest 31% faster, than wild type. Specific discrepancies between computations and observations suggest that a better understanding of the translation efficiency of individual mRNAs and the functions of qualitatively “nonessential” genes will be needed to improve the T7 simulation. In silico representations of biological systems can serve to assess and advance our understanding of the underlying biology. Iteration between computation, prediction, and observation should increase the rate at which biological hypotheses are formulated and tested.
Abstract:
We cloned a new inhibitor of apoptosis protein (IAP) homolog, SfIAP, from Spodoptera frugiperda Sf-21 cells, a host of insect baculoviruses. SfIAP contains two baculovirus IAP repeat domains followed by a RING domain. SfIAP has striking amino acid sequence similarity with baculoviral IAPs, CpIAP and OpIAP, suggesting that baculoviral IAPs may be host-derived genes. SfIAP and baculoviral CpIAP inhibit Bax but not Fas-induced apoptosis in human cells. Their apoptosis-suppressing activity in mammalian cells requires both baculovirus IAP repeat and RING domains. Further biochemical data suggest that SfIAP and CpIAP are specific inhibitors of mammalian caspase-9, the apical caspase in the mitochondria/cytochrome c pathway for apoptosis, but are not inhibitors of downstream caspase-3 and caspase-7. Thus the mechanisms by which insect and baculoviral IAPs suppress apoptosis may involve inhibition of an insect caspase-9 homologue. Peptides representing the IAP-binding domain of the Drosophila cell death protein Grim abrogated human caspase suppression by SfIAP and CpIAP, implying evolutionary conservation of the functions of IAPs and their inhibitors.
Abstract:
Cryptocyanin, a copper-free hexameric protein in crab (Cancer magister) hemolymph, has been characterized and the amino acid sequence has been deduced from its cDNA. It is markedly similar in sequence, size, and structure to hemocyanin, the copper-containing oxygen-transport protein found in many arthropods. Cryptocyanin does not bind oxygen, however, and lacks three of the six highly conserved copper-binding histidine residues of hemocyanin. Cryptocyanin has no phenoloxidase activity, although a phenoloxidase is present in the hemolymph. The concentration of cryptocyanin in the hemolymph is closely coordinated with the molt cycle and reaches levels higher than hemocyanin during premolt. Cryptocyanin resembles insect hexamerins in the lack of copper, molt cycle patterns of biosynthesis, and potential contributions to the new exoskeleton. Phylogenetic analysis of sequence similarities between cryptocyanin and other members of the hemocyanin gene family shows that cryptocyanin is closely associated with crustacean hemocyanins and suggests that cryptocyanin arose as a result of a hemocyanin gene duplication. The presence of both hemocyanin and cryptocyanin in one animal provides an example of how insect hexamerins might have evolved from hemocyanin. Our results suggest that multiple members of the hemocyanin gene family—hemocyanin, cryptocyanin, phenoloxidase, and hexamerins—may participate in two vital functions of molting animals, oxygen binding and molting. Cryptocyanin may provide important molecular data to further investigate evolutionary relationships among all molting animals.
Abstract:
Pairwise sequence comparison methods have been assessed using proteins whose relationships are known reliably from their structures and functions, as described in the scop database [Murzin, A. G., Brenner, S. E., Hubbard, T. & Chothia C. (1995) J. Mol. Biol. 247, 536–540]. The evaluation tested the programs blast [Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990). J. Mol. Biol. 215, 403–410], wu-blast2 [Altschul, S. F. & Gish, W. (1996) Methods Enzymol. 266, 460–480], fasta [Pearson, W. R. & Lipman, D. J. (1988) Proc. Natl. Acad. Sci. USA 85, 2444–2448], and ssearch [Smith, T. F. & Waterman, M. S. (1981) J. Mol. Biol. 147, 195–197] and their scoring schemes. The error rate of all algorithms is greatly reduced by using statistical scores to evaluate matches rather than percentage identity or raw scores. The E-value statistical scores of ssearch and fasta are reliable: the number of false positives found in our tests agrees well with the scores reported. However, the P-values reported by blast and wu-blast2 exaggerate significance by orders of magnitude. ssearch, fasta ktup = 1, and wu-blast2 perform best, and they are capable of detecting almost all relationships between proteins whose sequence identities are >30%. For more distantly related proteins, they do much less well; only one-half of the relationships between proteins with 20–30% identity are found. Because many homologs have low sequence similarity, most distant relationships cannot be detected by any pairwise comparison method; however, those which are identified may be used with confidence.
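The most sensitive program in this evaluation, ssearch, implements the Smith–Waterman local-alignment algorithm the study cites. A minimal dynamic-programming sketch (toy scoring: match +2, mismatch -1, linear gap -1, chosen for illustration rather than taken from any benchmarked program) computes the optimal local score that such tools then convert into an E-value:

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Optimal local alignment score via the Smith-Waterman recurrence:
    H[i][j] = max(0, H[i-1][j-1] + s(a_i, b_j),
                     H[i-1][j] + gap, H[i][j-1] + gap),
    where the 0 lets an alignment restart anywhere (locality)."""
    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best
```

The raw score returned here is exactly the quantity the abstract argues is unreliable on its own: its significance depends on sequence lengths and composition, which is why converting it to a statistical score (an E-value) reduces the error rate so much.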
Abstract:
Analyses of complete genomes indicate that a massive prokaryotic gene transfer (or transfers) preceded the formation of the eukaryotic cell. In comparisons of the entire set of Methanococcus jannaschii genes with their orthologs from Escherichia coli, Synechocystis 6803, and the yeast Saccharomyces cerevisiae, it is shown that prokaryotic genomes consist of two different groups of genes. The deeper-diverging informational lineage codes for genes that function in translation, transcription, and replication, and also includes GTPases, vacuolar ATPase homologs, and most tRNA synthetases. The more recently diverging operational lineage codes for amino acid synthesis, the biosynthesis of cofactors, the cell envelope, energy metabolism, intermediary metabolism, fatty acid and phospholipid biosynthesis, nucleotide biosynthesis, and regulatory functions. In eukaryotes, the informational genes are most closely related to those of Methanococcus, whereas the majority of operational genes are most closely related to those of Escherichia, but some are closest to Methanococcus or to Synechocystis.
Abstract:
We have shown previously by Southern blot analysis that Bov-B long interspersed nuclear elements (LINEs) are present in different Viperidae snake species. To address the question of whether Bov-B LINEs really have been transmitted horizontally between vertebrate classes, the analysis has been extended to a larger number of vertebrate, invertebrate, and plant species. In this paper, the evolutionary origin of Bov-B LINEs is shown unequivocally to be in Squamata. The previously proposed horizontal transfer of Bov-B LINEs in vertebrates has been confirmed by their discontinuous phylogenetic distribution in Squamata (Serpentes and two lizard infra-orders) as well as in Ruminantia, by the high level of nucleotide identity, and by their phylogenetic relationships. The horizontal transfer of Bov-B LINEs from Squamata to the ancestor of Ruminantia is evident from the genetic distances and discontinuous phylogenetic distribution. The ancestor of Colubroidea snakes is a possible donor of Bov-B LINEs to Ruminantia. The timing of horizontal transfer has been estimated from the distribution of Bov-B LINEs in Ruminantia and the fossil data of Ruminantia to be 40–50 My ago. The phylogenetic relationships of Bov-B LINEs from the various Squamata species agree with the species phylogeny, suggesting that Bov-B LINEs have been maintained stably by vertical transmission since the origin of Squamata in the Mesozoic era.
Abstract:
The proper development of digits, in tetrapods, requires the activity of several genes of the HoxA and HoxD homeobox gene complexes. By using a variety of loss-of-function alleles involving the five Hox genes that have been described to affect digit patterning, we report here that the group 11, 12, and 13 genes control both the size and number of murine digits in a dose-dependent fashion, rather than through a Hox code involving differential qualitative functions. A similar dose–response is observed in the morphogenesis of the penian bone, the baculum, which further suggests that digits and external genitalia share this genetic control mechanism. A progressive reduction in the dose of Hox gene products led first to ectrodactyly, then to oligodactyly and adactyly. Interestingly, this transition from the pentadactyl to the adactyl formula went through a step of polydactyly. We propose that in the distal appendage of polydactylous short-digited ancestral tetrapods, such as Acanthostega, the HoxA complex was predominantly active. Subsequent recruitment of the HoxD complex contributed to both a reduction in digit number and an increase in digit length. Thus, transition through a polydactylous limb before reaching and stabilizing the pentadactyl pattern may have relied, at least in part, on asynchronous and independent changes in the regulation of the HoxA and HoxD gene complexes.
Abstract:
The Lec35 gene product (Lec35p) is required for utilization of the mannose donor mannose-P-dolichol (MPD) in synthesis of both lipid-linked oligosaccharides (LLOs) and glycosylphosphatidylinositols, which are important for functions such as protein folding and membrane anchoring, respectively. The hamster Lec35 gene is shown to encode the previously identified cDNA SL15, which corrects the Lec35 mutant phenotype and predicts a novel endoplasmic reticulum membrane protein. The mutant hamster alleles Lec35.1 and Lec35.2 are characterized, and the human Lec35 gene (mannose-P-dolichol utilization defect 1) was mapped to 17p12-13. To determine whether Lec35p was required only for MPD-dependent mannosylation of LLO and glycosylphosphatidylinositol intermediates, two additional lipid-mediated reactions were investigated: MPD-dependent C-mannosylation of tryptophanyl residues, and glucose-P-dolichol (GPD)-dependent glucosylation of LLO. Both were found to require Lec35p. In addition, the SL15-encoded protein was selective for MPD compared with GPD, suggesting that an additional GPD-selective Lec35 gene product remains to be identified. The predicted amino acid sequence of Lec35p does not suggest an obvious function or mechanism. By testing the water-soluble MPD analog mannose-β-1-P-citronellol in an in vitro system in which the MPD utilization defect was preserved by permeabilization with streptolysin-O, it was determined that Lec35p is not directly required for the enzymatic transfer of mannose from the donor to the acceptor substrate. These results show that Lec35p has an essential role for all known classes of monosaccharide-P-dolichol-dependent reactions in mammals. The in vitro data suggest that Lec35p controls an aspect of MPD orientation in the endoplasmic reticulum membrane that is crucial for its activity as a donor substrate.