910 results for Sequence sorting
Abstract:
The explosion in the number of available sequences has allowed phylogenomics, the study of relationships among species based on large multi-gene alignments, to take off. It is unquestionably a way to overcome the stochastic errors of single-gene phylogenies, but many problems remain despite the progress made in modelling the evolutionary process. In this thesis, we set out to characterize certain aspects of the poor fit of the model to the data and to study their impact on the accuracy of inference. Unlike heterotachy, variation over time in the amino acid substitution process has received little attention so far. We show not only that this heterogeneity is widespread in animals, but also that its existence can impair the quality of phylogenomic inference. Thus, in the absence of an adequate model, removing the heterogeneous columns, which the model handles poorly, can eliminate a reconstruction artefact. In a phylogenomic setting, the sequencing techniques used often mean that not all genes are present for all species. The controversy over the impact of the number of empty cells has recently been revived, but most studies of missing data are carried out on small sets of simulated sequences. We therefore sought to quantify this impact for a large alignment of real data. For a reasonable proportion of missing data, it appears that the incompleteness of the alignment affects the accuracy of inference less than the choice of model does. On the contrary, adding an incomplete sequence that breaks a long branch can restore, at least partially, an erroneous phylogeny.
Since model violations remain the major limitation on the accuracy of phylogenetic inference, improving the sampling of species and genes remains a useful alternative in the absence of an adequate model. We therefore developed sequence-selection software that builds reproducible data sets based on the amount of data present, the rate of evolution and compositional biases. In this study we showed that human expertise still provides, for now, indispensable knowledge. The various analyses carried out for this thesis all point to the paramount importance of the evolutionary model.
Abstract:
Several mutations that cause severe forms of the human disease autosomal dominant retinitis pigmentosa cluster in the C-terminal region of rhodopsin. Recent studies have implicated the C-terminal domain of rhodopsin in its trafficking on specialized post-Golgi membranes to the rod outer segment of the photoreceptor cell. Here we used synthetic peptides as competitive inhibitors of rhodopsin trafficking in the frog retinal cell-free system to delineate the potential regulatory sequence within the C terminus of rhodopsin and model the effects of severe retinitis pigmentosa alleles on rhodopsin sorting. The rhodopsin C-terminal sequence QVS(A)PA is highly conserved among different species. Peptides that correspond to the C terminus of bovine (amino acids 324–348) and frog (amino acids 330–354) rhodopsin inhibited post-Golgi trafficking by 50% and 60%, respectively, and arrested newly synthesized rhodopsin in the trans-Golgi network. Peptides corresponding to the cytoplasmic loops of rhodopsin and other control peptides had no effect. When three naturally occurring mutations: Q344ter (lacking the last five amino acids QVAPA), V345M, and P347S were introduced into the frog C-terminal peptide, the inhibitory activity of the peptides was no longer detectable. These observations suggest that the amino acids QVS(A)PA comprise a signal that is recognized by specific factors in the trans-Golgi network. A lack of recognition of this sequence, because of mutations in the last five amino acids causing autosomal dominant retinitis pigmentosa, most likely results in abnormal post-Golgi membrane formation and in an aberrant subcellular localization of rhodopsin.
Abstract:
Seven hundred and nineteen samples from throughout the Cainozoic section in CRP-3 were analysed by a Malvern Mastersizer laser particle analyser, in order to derive a stratigraphic distribution of grain-size parameters downhole. Entropy analysis of these data (using the method of Woolfe and Michibayashi, 1995) allowed recognition of four groups of samples, each group characterised by a distinctive grain-size distribution. Group 1, which shows a multi-modal distribution, corresponds to mudrocks, interbedded mudrock/sandstone facies, muddy sandstones and diamictites. Group 2, with a sand-grade mode but showing wide dispersion of particle size, corresponds to muddy sandstones, a few cleaner sandstones and some conglomerates. Group 3 and Group 4 are also sand-dominated, with better grain-size sorting, and correspond to clean, well-washed sandstones of varying mean grain-size (medium and fine modes, respectively). The downhole disappearance of Group 1, and dominance of Groups 3 and 4 reflect a concomitant change from mudrock- and diamictite-rich lithology to a section dominated by clean, well-washed sandstones with minor conglomerates. Progressive downhole increases in percentage sand and principal mode also reflect these changes. Significant shifts in grain-size parameters and entropy group membership were noted across sequence boundaries and seismic reflectors, as recognised in other studies.
Abstract:
Microtubule plus-end-tracking proteins (+TIPs) specifically localize to the growing plus-ends of microtubules to regulate microtubule dynamics and functions. A large group of +TIPs contain a short linear motif, SXIP, which is essential for them to bind to end-binding proteins (EBs) and target microtubule ends. The SXIP sequence site thus acts as a widespread microtubule tip localization signal (MtLS). Here we have analyzed the sequence-function relationship of a canonical MtLS. Using synthetic peptide arrays on membrane supports, we identified the residue preferences at each amino acid position of the SXIP motif and its surrounding sequence with respect to EB binding. We further developed an assay based on fluorescence polarization to assess the mechanism of the EB-SXIP interaction and to correlate EB binding and microtubule tip tracking of MtLS sequences from different +TIPs. Finally, we investigated the role of phosphorylation in regulating the EB-SXIP interaction. Together, our results define the sequence determinants of a canonical MtLS and provide the experimental data for bioinformatics approaches to carry out genome-wide predictions of novel +TIPs in multiple organisms.
Abstract:
We explicitly tested for the first time the ‘environmental specificity’ of traditional 16S rRNA-targeted fluorescence in situ hybridization (FISH) through comparison of the bacterial diversity actually targeted in the environment with the diversity that should be exactly targeted (i.e. without mismatches) according to in silico analysis. To do this, we exploited advances in modern flow cytometry that enabled improved detection, and therefore sorting, of sub-micron-sized particles, and used probe PSE1284 (designed to target Pseudomonads) applied to Lolium perenne rhizosphere soil as our test system. The 6-carboxyfluorescein (6-FAM)-PSE1284-hybridised population, defined as displaying enhanced green fluorescence in flow cytometry, represented 3.51±1.28% of the total detected population when corrected using a nonsense (NON-EUB338) probe control. Analysis of 16S rRNA gene libraries constructed from Fluorescence Activated Cell Sorted (FACS)-recovered fluorescent populations (n=3) revealed that 98.5% (Pseudomonas spp. comprised 68.7% and Burkholderia spp. 29.8%) of the total sorted population was specifically targeted, as evidenced by the homology of the 16S rRNA sequences to the probe sequence. In silico evaluation of probe PSE1284 with the use of RDP-10 probeMatch justified the existence of Burkholderia spp. among the sorted cells. The lack of novelty in the Pseudomonas spp. sequences uncovered was notable, probably reflecting the well-studied nature of this functionally important genus. To judge the diversity recorded within the FACS-sorted population, rarefaction and DGGE analysis were used to evaluate, respectively, the proportion of Pseudomonas diversity uncovered by the sequencing effort and the representativeness of the Nycodenz® method for the extraction of bacterial cells from soil.
Abstract:
This paper describes a fast integer sorting algorithm, herein referred to as Bit-index sort, a non-comparison sorting algorithm for partial permutations whose execution time is linear. Bit-index sort uses a bit array to classify input sequences of distinct integers and exploits built-in bit functions in C compilers, supported by machine hardware, to retrieve the ordered output sequence. Results show that Bit-index sort outperforms quicksort and counting sort in execution time. A parallel version of Bit-index sort using two simultaneous threads is included, which obtains speedups of up to 1.6.
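The idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `bit_index_sort`, `universe_size` and `WORD_BITS` are hypothetical, the 64-bit word layout is an assumption, and the expression `low.bit_length() - 1` stands in for the hardware count-trailing-zeros built-in that a C compiler would provide.

```python
WORD_BITS = 64  # assumed machine word size for the bit array


def bit_index_sort(values, universe_size):
    """Sort distinct integers in [0, universe_size) using a bit array.

    Illustrative sketch only: each input value sets one bit, then the
    set bits are read back in increasing order.
    """
    words = [0] * ((universe_size + WORD_BITS - 1) // WORD_BITS)

    # Classification pass: mark each value in the bit array.
    for v in values:
        if not 0 <= v < universe_size:
            raise ValueError(f"value {v} outside [0, {universe_size})")
        w, b = divmod(v, WORD_BITS)
        if (words[w] >> b) & 1:
            raise ValueError(f"duplicate value {v}")
        words[w] |= 1 << b

    # Retrieval pass: repeatedly extract the lowest set bit of each word.
    out = []
    for w, word in enumerate(words):
        while word:
            low = word & -word                  # isolate lowest set bit
            out.append(w * WORD_BITS + low.bit_length() - 1)
            word ^= low                         # clear that bit
    return out
```

For example, `bit_index_sort([5, 1, 130, 64, 0], 200)` yields the values in increasing order. The work is proportional to the number of inputs plus the number of words in the bit array, which is why the approach suits distinct integers drawn from a bounded range.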
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
A permutation is said to avoid a pattern if it does not contain any subsequence that is order-isomorphic to it. Donald Knuth, in the first volume of his celebrated book "The Art of Computer Programming", observed that the permutations that can be computed (or, equivalently, sorted) by some particular data structures can be characterized in terms of pattern avoidance. In more recent years, the topic has been reopened several times, often in terms of sortable permutations rather than computable ones. The idea of sorting permutations by using one of Knuth’s devices suggests looking for a deterministic procedure that decides, in linear time, whether there exists a sequence of operations that converts a given permutation into the identity. In this thesis we show that, for the stack and the restricted deques, there exists a unique way to implement such a procedure. Moreover, we use these sorting procedures to create new sorting algorithms, and we prove some unexpected commutation properties between these procedures and the base step of bubblesort. We also show that the permutations that can be sorted by a combination of the base steps of bubblesort and its dual can be expressed, once again, in terms of pattern avoidance. In the final chapter we give an alternative proof of some enumerative results, in particular for the classes of permutations that can be sorted by the two restricted deques. It is well known that the permutations that can be sorted through a restricted deque are counted by the Schröder numbers. In the thesis, we show how the deterministic sorting procedures yield a bijection between sortable permutations and Schröder paths.
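For the stack case, the link between sortability and pattern avoidance can be made concrete with a short Python sketch. It shows the classical greedy one-pass stack procedure going back to Knuth, which is not necessarily the formulation used in the thesis, and all function names are hypothetical. A permutation is sorted by one pass through a stack exactly when it avoids the pattern 231, and the number of such permutations of each length is a Catalan number.

```python
from itertools import combinations, permutations


def stack_sort(perm):
    """One greedy pass through a stack, in time linear in len(perm)."""
    stack, out = [], []
    for x in perm:
        # Pop everything smaller than the incoming element first.
        while stack and stack[-1] < x:
            out.append(stack.pop())
        stack.append(x)
    while stack:
        out.append(stack.pop())
    return out


def is_stack_sortable(perm):
    """True if the single stack pass outputs the identity ordering."""
    return stack_sort(perm) == sorted(perm)


def contains_231(perm):
    """Brute-force check for a subsequence order-isomorphic to 2 3 1."""
    return any(c < a < b for a, b, c in combinations(perm, 3))


def count_stack_sortable(n):
    """Count stack-sortable permutations of 1..n by exhaustive search."""
    return sum(is_stack_sortable(p) for p in permutations(range(1, n + 1)))
```

Here `stack_sort([3, 1, 2])` returns the sorted list, whereas `[2, 3, 1]` is itself an occurrence of 231 and comes out unsorted. The brute-force helpers exist only to cross-check the greedy procedure on small inputs.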
Abstract:
The Schwalbenberg II loess-paleosol sequence (LPS) denotes a key site for Marine Isotope Stage 3 (MIS 3) in Western Europe owing to eight succeeding cambisols, which primarily constitute the Ahrgau Subformation. Therefore, this LPS qualifies as a test candidate for the potential of temporal high-resolution geochemical data obtained by X-ray fluorescence (XRF) scanning of discrete samples, which provides a fast and non-destructive tool for determining the element composition. The geochemical data are first placed in the context of existing proxy data such as magnetic susceptibility (MS) and organic carbon (Corg) and then aggregated to element log ratios characteristic of weathering intensity [LOG (Ca/Sr), LOG (Rb/Sr), LOG (Ba/Sr), LOG (Rb/K)] and dust provenance [LOG (Ti/Zr), LOG (Ti/Al), LOG (Si/Al)]. Generally, the interpretation of rock magnetic particles is challenging in western Europe, where not only magnetic enhancement but also depletion plays a role. Our data indicate MS depletion induced by leaching and top-soil erosion at the Schwalbenberg II LPS. Besides weathering, LOG (Ca/Sr) is susceptible to secondary calcification. LOG (Rb/Sr) and LOG (Ba/Sr) are also shown to be influenced by calcification dynamics. Consequently, LOG (Rb/K) seems to be the most suitable weathering index, identifying the Sinzig Soils S1 and S2 as the most pronounced paleosols for this site. Sinzig Soil S3 is enclosed by gelic gleysols and, in contrast to S1 and S2, only initially weathered, pointing to colder climate conditions. The Remagen Soils are also characterized by subtle to moderate positive excursions in the weathering indices. Comparing the Schwalbenberg II LPS with the nearby Eifel Lake Sediment Archive (ELSA) and other more distant German, Austrian and Czech LPS while discussing time and climate as limiting factors for pedogenesis, we suggest that the lithologically determined paleosols are in-situ soil formations. 
The provenance indices document a Zr-enrichment at the transition from the Ahrgau to the Hesbaye Subformation. This is explained by a conceptual model incorporating multiple sediment recycling and sorting effects in eolian and fluvial domains.
Abstract:
Applications that operate on meshes are very popular in High Performance Computing (HPC) environments. In the past, many techniques have been developed in order to optimize the memory accesses for these datasets. Different loop transformations and domain decompositions are commonly used for structured meshes. However, unstructured grids are more challenging. The memory accesses, based on the mesh connectivity, do not map well to the usual linear memory model. This work presents a method to improve the memory performance which is suitable for HPC codes that operate on meshes. We develop a method to adjust the sequence in which the data are used inside the algorithm, by means of traversing and sorting the mesh. This sorted mesh can be transferred sequentially to the lower memory levels and allows for minimum data transfer requirements. The method also reduces the lower memory requirements dramatically: up to 63% of the L1 cache misses are removed in a traditional cache system. We have obtained speedups of up to 2.58 on memory operations as measured in a general-purpose CPU. An improvement is also observed with sequential access memories, where we have observed reductions of up to 99% in the required low-level memory size.
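The abstract does not specify how the mesh is traversed and sorted. As a rough illustration of the general idea only, the Python sketch below renumbers the nodes of a connected unstructured mesh in breadth-first order, a standard locality heuristic swapped in here as an assumption. The function names and the toy adjacency list are hypothetical.

```python
from collections import deque


def bfs_order(adjacency, start=0):
    """Return node ids of a connected mesh graph in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nb in sorted(adjacency[node]):  # deterministic neighbour order
            if nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return order


def bandwidth(adjacency, position):
    """Largest distance in memory between two connected nodes."""
    return max(abs(position[a] - position[b])
               for a in adjacency for b in adjacency[a])


# Toy connectivity: a chain whose node ids are scrambled in memory.
mesh = {0: [3], 3: [0, 1], 1: [3, 4], 4: [1, 2], 2: [4]}
original = {n: n for n in mesh}                       # storage order = id
reordered = {n: i for i, n in enumerate(bfs_order(mesh))}
```

In this toy case connected nodes end up adjacent in the reordered layout, so a sweep over the mesh touches memory almost sequentially. Real unstructured meshes would need a more careful choice of start node and handling of disconnected components.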
Abstract:
The 67-amino acid cytoplasmic tail of the cation-dependent mannose 6-phosphate receptor (CD-MPR) contains a signal(s) that prevents the receptor from entering lysosomes where it would be degraded. To identify the key residues required for proper endosomal sorting, we analyzed the intracellular distribution of mutant forms of the receptor by Percoll density gradients. A receptor with a Trp19 → Ala substitution in the cytoplasmic tail was highly missorted to lysosomes whereas receptors with either Phe18 → Ala or Phe13 → Ala mutations were partially defective in avoiding transport to lysosomes. Analysis of double and triple mutants confirmed the key role of Trp19 for sorting of the CD-MPR in endosomes, with Phe18, Phe13, and several neighboring residues contributing to this function. The addition of the Phe18-Trp19 motif of the CD-MPR to the cytoplasmic tail of the lysosomal membrane protein Lamp1 was sufficient to partially impair its delivery to lysosomes. Replacing Phe18 and Trp19 with other aromatic amino acids did not impair endosomal sorting of the CD-MPR, indicating that two aromatic residues located at these positions are sufficient to prevent the receptor from trafficking to lysosomes. However, alterations in the spacing of the diaromatic amino acid sequence relative to the transmembrane domain resulted in receptor accumulation in lysosomes. These findings indicate that the endosomal sorting of the CD-MPR depends on the correct presentation of a diaromatic amino acid-containing motif in its cytoplasmic tail. Because a diaromatic amino acid sequence is also present in the cytoplasmic tail of other receptors known to be internalized from the plasma membrane, this feature may prove to be a general determinant for endosomal sorting.
Abstract:
We previously identified the 11 amino acid C1 region of the cytoplasmic domain of P-selectin as essential for an endosomal sorting event that confers rapid turnover on P-selectin. The amino acid sequence of this region has no obvious similarity to other known sorting motifs. We have analyzed the sequence requirements for endosomal sorting by measuring the effects of site-specific mutations on the turnover of P-selectin and of the chimeric protein LLP, containing the lumenal and transmembrane domains of the low density lipoprotein receptor and the cytoplasmic domain of P-selectin. Endosomal sorting activity was remarkably tolerant of alanine substitutions within the C1 region. The activity was eliminated by alanine substitution of only one amino acid residue, leucine 768, where substitution with several other large side chains, hydrophobic and polar, maintained the sorting activity. The results indicate that the endosomal sorting determinant is not structurally related to previously reported sorting determinants. Rather, the results suggest that the structure of the sorting determinant is dependent on the tertiary structure of the cytoplasmic domain.
Abstract:
The muscle isoform of clathrin heavy chain, CHC22, has 85% sequence identity to the ubiquitously expressed CHC17, yet its expression pattern and function appear to be distinct from those of well-characterized clathrin-coated vesicles. In mature muscle CHC22 is preferentially concentrated at neuromuscular and myotendinous junctions, suggesting a role at sarcolemmal contacts with extracellular matrix. During myoblast differentiation, CHC22 expression is increased, initially localized with desmin and nestin and then preferentially segregated to the poles of fused myoblasts. CHC22 expression is also increased in regenerating muscle fibers with the same time course as embryonic myosin, indicating a role in muscle repair. CHC22 binds to sorting nexin 5 through a coiled-coil domain present in both partners, which is absent in CHC17 and coincides with the region on CHC17 that binds the regulatory light-chain subunit. These differential binding data suggest a mechanism for the distinct functions of CHC22 relative to CHC17 in membrane traffic during muscle development, repair, and at neuromuscular and myotendinous junctions.