951 results for sequence based alignments
Abstract:
The background to the current problems in implementing Information Technology (IT) service management in Small and Medium Enterprises (SMEs) is described. The reasons why SMEs find it so difficult to reach a maturity/capability level through well-known standards, or to implement good software engineering practices by means of the IT Infrastructure Library, are set out, and solutions to these problems are explained. The goals of the master's thesis are then presented in terms of purpose, research questions, research goals, objectives and scope. Finally, the thesis structure is described.
Abstract:
Objective: The description and evaluation of the performance of a new real-time seizure detection algorithm in the newborn infant. Methods: The algorithm includes parallel fragmentation of EEG signal into waves; wave-feature extraction and averaging; elementary, preliminary and final detection. The algorithm detects EEG waves with heightened regularity, using wave intervals, amplitudes and shapes. The performance of the algorithm was assessed with the use of event-based and liberal and conservative time-based approaches and compared with the performance of Gotman's and Liu's algorithms. Results: The algorithm was assessed on multi-channel EEG records of 55 neonates including 17 with seizures. The algorithm showed sensitivities ranging 83-95% with positive predictive values (PPV) 48-77%. There were 2.0 false positive detections per hour. In comparison, Gotman's algorithm (with 30 s gap-closing procedure) displayed sensitivities of 45-88% and PPV 29-56%; with 7.4 false positives per hour and Liu's algorithm displayed sensitivities of 96-99%, and PPV 10-25%; with 15.7 false positives per hour. Conclusions: The wave-sequence analysis based algorithm displayed higher sensitivity, higher PPV and a substantially lower level of false positives than two previously published algorithms. Significance: The proposed algorithm provides a basis for major improvements in neonatal seizure detection and monitoring. Published by Elsevier Ireland Ltd. on behalf of International Federation of Clinical Neurophysiology.
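The event-based figures quoted in this abstract (sensitivity, PPV and false positives per hour) follow from simple counts. The sketch below shows the standard definitions only; the function name and the counts in the usage example are hypothetical and are not taken from the study.

```python
def detection_metrics(true_positives, false_negatives, false_positives, hours):
    """Return (sensitivity, PPV, false positives per hour) from event counts.

    All argument values are supplied by the caller; nothing here reproduces
    the paper's data or its detection algorithm.
    """
    sensitivity = true_positives / (true_positives + false_negatives)
    ppv = true_positives / (true_positives + false_positives)
    fp_per_hour = false_positives / hours
    return sensitivity, ppv, fp_per_hour


# Hypothetical counts, chosen only to illustrate the arithmetic.
sens, ppv, fp_rate = detection_metrics(
    true_positives=9, false_negatives=1, false_positives=3, hours=2
)
print(f"sensitivity={sens:.2f} PPV={ppv:.2f} FP/h={fp_rate:.1f}")
```

A detector can score high on sensitivity and still be hard to use in practice if its PPV is low, which is why the abstract reports both alongside the hourly false-positive rate.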
Abstract:
We have developed an alignment-free method that calculates phylogenetic distances using a maximum-likelihood approach for a model of sequence change on patterns that are discovered in unaligned sequences. To evaluate the phylogenetic accuracy of our method, and to conduct a comprehensive comparison of existing alignment-free methods (freely available as Python package decaf+py at http://www.bioinformatics.org.au), we have created a data set of reference trees covering a wide range of phylogenetic distances. Amino acid sequences were evolved along the trees and input to the tested methods; from their calculated distances we inferred trees whose topologies we compared to the reference trees. We find our pattern-based method statistically superior to all other tested alignment-free methods. We also demonstrate the general advantage of alignment-free methods over an approach based on automated alignments when sequences violate the assumption of collinearity. Similarly, we compare methods on empirical data from an existing alignment benchmark set that we used to derive reference distances and trees. Our pattern-based approach yields distances that show a linear relationship to reference distances over a substantially longer range than other alignment-free methods. The pattern-based approach outperforms alignment-free methods and its phylogenetic accuracy is statistically indistinguishable from alignment-based distances.
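To make "alignment-free distance" concrete, here is a minimal sketch of one generic approach: comparing k-mer count profiles with a Euclidean distance. It is not the pattern-based maximum-likelihood method of this abstract and not the decaf+py API; the function names, the default k = 3 and the toy sequences are assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count every overlapping k-mer in an unaligned sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(seq_a, seq_b, k=3):
    """Euclidean distance between the k-mer count profiles of two sequences."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # A Counter returns 0 for k-mers it has not seen, so the union is safe.
    all_kmers = set(counts_a) | set(counts_b)
    return sqrt(sum((counts_a[m] - counts_b[m]) ** 2 for m in all_kmers))


# Toy amino acid strings, invented for this example.
print(kmer_distance("MKVLAAGIVK", "MKVLSAGIVK"))
print(kmer_distance("MKVLAAGIVK", "GIVKMKVLAA"))  # rearranged segments
```

Because no alignment is built, rearranged segments still share most of their k-mers, which is the kind of non-collinear case the abstract says favours alignment-free methods.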
Abstract:
Switched mode power supplies (SMPSs) are essential components in many applications, and electromagnetic interference is an important consideration in the SMPS design. Spread spectrum based PWM strategies have been used in SMPS designs to reduce the switching harmonics. This paper proposes a novel method to integrate a communication function into spread spectrum based PWM strategy without extra hardware costs. Direct sequence spread spectrum (DSSS) and phase shift keying (PSK) data modulation are employed to the PWM of the SMPS, so that it has reduced switching harmonics and the input and output power line voltage ripples contain data. A data demodulation algorithm has been developed for receivers, and code division multiple access (CDMA) concept is employed as communication method for a system with multiple SMPSs. The proposed method has been implemented in both Buck and Boost converters. The experimental results validated the proposed DSSS based PWM strategy for both harmonic reduction and communication.
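The combination of direct sequence spreading, phase shift keying and CDMA described above can be sketched at the symbol level. This is an idealized illustration, not the paper's PWM modulation or demodulation algorithm: it ignores the converter, noise and synchronization, and the two length-4 orthogonal codes and the bit patterns are assumptions picked for the example.

```python
def spread(bits, code):
    """Map each bit to +1/-1 (binary PSK) and multiply it by every chip of the code."""
    chips = []
    for bit in bits:
        symbol = 1 if bit else -1
        chips.extend(symbol * chip for chip in code)
    return chips


def despread(chips, code):
    """Correlate each code-length block with the code and take the sign as the bit."""
    n = len(code)
    bits = []
    for start in range(0, len(chips), n):
        block = chips[start:start + n]
        correlation = sum(x * c for x, c in zip(block, code))
        bits.append(1 if correlation > 0 else 0)
    return bits


# Two mutually orthogonal spreading codes (hypothetical choice).
CODE_A = [1, 1, -1, -1]
CODE_B = [1, -1, -1, 1]

bits_a = [1, 0, 1]
bits_b = [0, 0, 1]

# Both transmitters share one line, so their chip streams simply add.
combined = [a + b for a, b in zip(spread(bits_a, CODE_A), spread(bits_b, CODE_B))]

print(despread(combined, CODE_A))
print(despread(combined, CODE_B))
```

Since the two codes are orthogonal, the correlation against one code cancels the other transmitter's contribution, which is what allows several senders to share a line in a CDMA scheme.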
Molecular protein function prediction using sequence similarity-based and similarity-free approaches
Abstract:
Thesis digitized by the Direction des bibliothèques of the Université de Montréal.
Abstract:
Ochnaceae s.str. (Malpighiales) are a pantropical family of about 500 species and 27 genera of almost exclusively woody plants. Infrafamilial classification and relationships have been controversial partially due to the lack of a robust phylogenetic framework. Including all genera except Indosinia and Perissocarpa and DNA sequence data for five DNA regions (ITS, matK, ndhF, rbcL, trnL-F), we provide for the first time a nearly complete molecular phylogenetic analysis of Ochnaceae s.l. resolving most of the phylogenetic backbone of the family. Based on this, we present a new classification of Ochnaceae s.l., with Medusagynoideae and Quiinoideae included as subfamilies and the former subfamilies Ochnoideae and Sauvagesioideae recognized at the rank of tribe. Our data support a monophyletic Ochneae, but Sauvagesieae in the traditional circumscription is paraphyletic because Testulea emerges as sister to the rest of Ochnoideae, and the next clade shows Luxemburgia+Philacra as sister group to the remaining Ochnoideae. To avoid paraphyly, we classify Luxemburgieae and Testuleeae as new tribes. The African genus Lophira, which has switched between subfamilies (here tribes) in past classifications, emerges as sister to all other Ochneae. Thus, endosperm-free seeds and ovules with partly to completely united integuments (resulting in an apparently single integument) are characters that unite all members of that tribe. The relationships within its largest clade, Ochnineae (former Ochneae), are poorly resolved, but former Ochninae (Brackenridgea, Ochna) are polyphyletic. Within Sauvagesieae, the genus Sauvagesia in its broad circumscription is polyphyletic as Sauvagesia serrata is sister to a clade of Adenarake, Sauvagesia spp., and three other genera. Within Quiinoideae, in contrast to former phylogenetic hypotheses, Lacunaria and Touroulia form a clade that is sister to Quiina. 
Bayesian ancestral state reconstructions showed that zygomorphic flowers with adaptations to buzz-pollination (poricidal anthers), a syncarpous gynoecium (a near-apocarpous gynoecium evolved independently in Quiinoideae and Ochninae), numerous ovules, septicidal capsules, and winged seeds with endosperm are the ancestral condition in Ochnoideae. Although in some lineages poricidal anthers were lost secondarily, the evolution of poricidal superstructures secured the maintenance of buzz-pollination in some of these genera, indicating a strong selective pressure on keeping that specialized pollination system.
Abstract:
High-throughput screening of physical, genetic and chemical-genetic interactions brings important perspectives in the Systems Biology field, as the analysis of these interactions provides new insights into protein/gene function, cellular metabolic variations and the validation of therapeutic targets and drug design. However, such analysis depends on a pipeline connecting different tools that can automatically integrate data from diverse sources and result in a more comprehensive dataset that can be properly interpreted. We describe here the Integrated Interactome System (IIS), an integrative platform with a web-based interface for the annotation, analysis and visualization of the interaction profiles of proteins/genes, metabolites and drugs of interest. IIS works in four connected modules: (i) Submission module, which receives raw data derived from Sanger sequencing (e.g. two-hybrid system); (ii) Search module, which enables the user to search for the processed reads to be assembled into contigs/singlets, or for lists of proteins/genes, metabolites and drugs of interest, and add them to the project; (iii) Annotation module, which assigns annotations from several databases for the contigs/singlets or lists of proteins/genes, generating tables with automatic annotation that can be manually curated; and (iv) Interactome module, which maps the contigs/singlets or the uploaded lists to entries in our integrated database, building networks that gather novel identified interactions, protein and metabolite expression/concentration levels, subcellular localization and computed topological metrics, GO biological processes and KEGG pathways enrichment. This module generates an XGMML file that can be imported into Cytoscape or be visualized directly on the web. We have developed IIS by integrating diverse databases, in response to the need for appropriate tools for the systematic analysis of physical, genetic and chemical-genetic interactions.
IIS was validated with yeast two-hybrid, proteomics and metabolomics datasets, but it is also extendable to other datasets. IIS is freely available online at: http://www.lge.ibi.unicamp.br/lnbio/IIS/.
Abstract:
Avian pathogenic Escherichia coli (APEC) strains belong to a category that is associated with colibacillosis, a serious illness in the poultry industry worldwide. Additionally, some APEC groups have recently been described as potential zoonotic agents. In this work, we compared APEC strains with extraintestinal pathogenic E. coli (ExPEC) strains isolated from clinical cases of humans with extra-intestinal diseases such as urinary tract infections (UTI) and bacteremia. PCR results showed that genes usually found in the ColV plasmid (tsh, iucA, iss, and hlyF) were associated with APEC strains while fyuA, irp-2, fepC sitDchrom, fimH, crl, csgA, afa, iha, sat, hlyA, hra, cnf1, kpsMTII, clpVSakai and malX were associated with human ExPEC. Both categories shared nine serogroups (O2, O6, O7, O8, O11, O19, O25, O73 and O153) and seven sequence types (ST10, ST88, ST93, ST117, ST131, ST155, ST359, ST648 and ST1011). Interestingly, ST95, which is associated with the zoonotic potential of APEC and is spread in avian E. coli of North America and Europe, was not detected among 76 APEC strains. When the strains were clustered based on the presence of virulence genes, most ExPEC strains (71.7%) were contained in one cluster while most APEC strains (63.2%) segregated to another. In general, the strains showed distinct genetic and fingerprint patterns, but avian and human strains of ST359, or ST23 clonal complex (CC), presented more than 70% of similarity by PFGE. The results demonstrate that some zoonotic-related STs (ST117, ST131, ST10CC, ST23CC) are present in Brazil. Also, the presence of moderate fingerprint similarities between ST359 E. coli of avian and human origin indicates that strains of this ST are candidates for having zoonotic potential.
Abstract:
The complete SSU rDNA was sequenced for 10 individuals of Cladophora vagabunda collected along the coast of Brazil. For C. rupestris (L.) Kütz. a partial SSU rDNA sequence (1634 bp) was obtained. Phylogenetic trees indicate that Cladophora is paraphyletic, but the section Glomeratae sensu lato, including C. vagabunda from Brazil, Japan and France, C. albida (Nees) Kütz., C. sericea (Hudson) Kütz., and C. glomerata (L.) Kütz., is monophyletic. Within this group C. vagabunda is paraphyletic. The sequence identity for the SSU rDNA varied from 98.9% to 100% for the Brazilian C. vagabunda, and from 98.3% to 99.7% when comparing the Brazilian individuals to the ones from France and Japan. Sequence identity of the Brazilian C. vagabunda to C. albida and C. sericea varies from 98.0% to 98.6%. The SSU rDNA phylogeny partially supports the morphological characteristics presented by Brazilian populations of C. vagabunda. On the other hand, C. rupestris from Brazil does not group with C. rupestris from France, the two sequences showing only 96.9% identity. The inclusion of sequences of individuals from Brazil reinforces the need for a taxonomic revision of the genus Cladophora and of the C. vagabunda complex.
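The identity percentages reported above are computed from pairwise aligned sequences. The sketch below shows one common definition, in which columns containing a gap in either sequence are excluded; the abstract does not state which definition the authors used, so that choice, the function name and the toy sequences are all assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments, invented for this example.
print(percent_identity("ACGTTGCA", "ACGTAGCA"))  # one mismatch in eight columns
print(percent_identity("ACG-TGCA", "ACGTTGCA"))  # the gapped column is skipped
```

Other conventions divide by the full alignment length or by the shorter sequence, so identity values from different studies are only comparable when the definition is the same.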
Abstract:
The availability of chloroplast genome (cpDNA) sequences of Atropa belladonna, Nicotiana sylvestris, N. tabacum, N. tomentosiformis, Solanum bulbocastanum, S. lycopersicum and S. tuberosum, which are Solanaceae species, allowed us to analyze the organization of cpSSRs in their genic and intergenic regions. In general, the number of cpSSRs in cpDNA ranged from 161 in S. tuberosum to 226 in N. tabacum, and the number of intergenic cpSSRs was higher than that of genic cpSSRs. Mononucleotide repeats were the most frequent in the studied species, but we also identified di-, tri-, tetra-, penta- and hexanucleotide repeats. Multiple alignments of all cpSSR sequences from the Solanaceae species made it possible to identify nucleotide variability, and the phylogeny was estimated by maximum parsimony. Our study showed that the plastome database can be exploited for phylogenetic analyses and biotechnological approaches.
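Locating simple sequence repeats such as the cpSSRs discussed above is, at its simplest, a pattern search for a short unit repeated in tandem. The sketch below uses regular expressions for mono-, di- and trinucleotide units; the minimum repeat counts, the function name and the test sequence are hypothetical, and the abstract does not say which tool or thresholds the authors used.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, unit, repeat_count) tuples for tandem repeats in seq.

    min_repeats maps unit length to the minimum number of tandem copies;
    the defaults are illustrative thresholds, not values from the study.
    """
    if min_repeats is None:
        min_repeats = {1: 8, 2: 4, 3: 3}
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # A captured unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units made of a single base (e.g. "AA"); those runs are
            # reported by the mononucleotide pass instead.
            if unit_len > 1 and len(set(unit)) == 1:
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


# Invented test sequence: a run of ten A and five copies of AT.
example = "GGC" + "A" * 10 + "CCG" + "AT" * 5 + "GCG"
for start, unit, count in find_ssrs(example):
    print(start, unit, count)
```

Extending the dictionary with entries for unit lengths 4 to 6 would cover the tetra-, penta- and hexanucleotide classes mentioned in the abstract, although longer units composed of a repeated shorter motif would then need an additional filter.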
Abstract:
The strategy used to treat HCV infection depends on the genotype involved. An accurate and reliable genotyping method is therefore of paramount importance. We describe here, for the first time, the use of a liquid microarray for HCV genotyping. This liquid microarray is based on the 5'UTR - the most highly conserved region of HCV - and the variable region NS5B sequence. The simultaneous genotyping of two regions can be used to confirm findings and should detect inter-genotypic recombination. Plasma samples from 78 patients infected with viruses with genotypes and subtypes determined in the Versant (TM) HCV Genotype Assay LiPA (version I; Siemens Medical Solutions, Diagnostics Division, Fernwald, Germany) were tested with our new liquid microarray method. This method successfully determined the genotypes of 74 of the 78 samples previously genotyped in the Versant (TM) HCV Genotype Assay LiPA (74/78, 95%). The concordance between the two methods was 100% for genotype determination (74/74). At the subtype level, all 3a and 2b samples gave identical results with both methods (17/17 and 7/7, respectively). Two 2c samples were correctly identified by microarray, but could only be determined to the genotype level with the Versant (TM) HCV assay. Genotype "1" subtypes (1a and 1b) were correctly identified by the Versant (TM) HCV assay and the microarray in 68% and 40% of cases, respectively. No genotype discordance was found for any sample. HCV was successfully genotyped with both methods, and this is of prime importance for treatment planning. Liquid microarray assays may therefore be added to the list of methods suitable for HCV genotyping. They provide comparable results and may readily be adapted for the detection of other viruses frequently co-infecting HCV patients. Liquid array technology is thus a reliable and promising platform for HCV genotyping.
Abstract:
In this study, 222 genome survey sequences were generated for Trypanosoma rangeli strain P07 isolated from an opossum (Didelphis albiventris) in Minas Gerais State, Brazil. T. rangeli sequences were compared by BLASTX (Basic Local Alignment Search Tool X) analysis with the assembled contigs of Leishmania braziliensis, Leishmania infantum, Leishmania major, Trypanosoma brucei, and Trypanosoma cruzi. Results revealed that 82% (182/222) of the sequences were associated with predicted proteins described, whereas 18% (40/222) of the sequences did not show significant identity with sequences deposited in databases, suggesting that they may represent T. rangeli-specific sequences. Among the 182 predicted sequences, 179 (80.6%) had the highest similarity with T. cruzi, 2 (0.9%) with T. brucei, and 1 (0.5%) with L. braziliensis. Computer analysis permitted the identification of members of various gene families described for trypanosomatids in the genome of T. rangeli, such as trans-sialidases, mucin-associated surface proteins, and major surface proteases (MSP or gp63). This is the first report identifying sequences of the MSP family in T. rangeli. Multiple sequence alignments showed that the predicted MSP of T. rangeli presented the typical characteristics of metalloproteases, such as the presence of the HEXXH motif, which corresponds to a region previously associated with the catalytic site of the enzyme, and various cysteine and proline residues, which are conserved among MSPs of different trypanosomatid species. Reverse transcriptase-polymerase chain reaction analysis revealed the presence of MSP transcripts in epimastigote forms of T. rangeli.
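The HEXXH motif mentioned above (His, Glu, any two residues, His) is easy to scan for in a protein sequence. The sketch below is a plain regular-expression search; the function name and the toy sequence are invented for illustration and the sequence is not a T. rangeli MSP.

```python
import re


def find_hexxh(protein):
    """Return (start, motif) for every HEXXH occurrence, including overlapping ones."""
    # A lookahead with a capture group lets finditer report overlapping matches.
    pattern = re.compile(r"(?=(HE[A-Z]{2}H))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(protein)]


# Toy sequence made up for this example.
toy_protein = "MKTHEAGHLLHELLH"
print(find_hexxh(toy_protein))
```

A motif hit alone does not establish that a protein is a metalloprotease; in the study it was read together with the conserved cysteine and proline residues seen in the multiple sequence alignments.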
Abstract:
Background: High-throughput molecular approaches for gene expression profiling, such as Serial Analysis of Gene Expression (SAGE), Massively Parallel Signature Sequencing (MPSS) or Sequencing-by-Synthesis (SBS), are powerful techniques that provide global transcription profiles of different cell types through sequencing of short fragments of transcripts, termed sequence tags. These techniques have improved our understanding of the relationships between these expression profiles and cellular phenotypes. Despite this, more reliable datasets are still necessary. In this work, we present a web-based tool named S3T: Score System for Sequence Tags, to index sequenced tags in accordance with their reliability. This is done through a series of evaluations based on a defined rule set. S3T allows the identification and selection of tags considered more reliable for further gene expression analysis. Results: This methodology was applied to a public SAGE dataset. In order to compare data before and after filtering, a hierarchical clustering analysis was performed on samples from the same type of tissue, in distinct biological conditions, using these two datasets. Our results provide evidence suggesting that it is possible to find more congruous clusters after using the S3T scoring system. Conclusion: These results substantiate the proposed application to generate more reliable data. This is a significant contribution to the determination of global gene expression profiles. The library analysis with S3T is freely available at http://gdm.fmrp.usp.br/s3t/. S3T source code and datasets can also be downloaded from the aforementioned website.
Abstract:
The aim of this study was to investigate HIV-1 molecular diversity and the epidemiological profile of HIV-1-infected patients from Ribeirao Preto, Brazil. A nested PCR followed by sequencing of a 302-base pair fragment of the env gene (C2-V3 region) was performed in samples from HIV-1-positive patients. A total of 45 sequences were aligned with final manual adjustments. The phylogenetic analyses showed a higher prevalence of HIV-1 subtype B in the studied population (97.8%) with only one sample yielding an F1 subtype. The viral genotyping prediction showed that CCR5 tropism was the most prevalent in the studied cohort. Geno2pheno analysis showed that R5 and CXCR4 prediction were 69% and 31%, respectively. There was no statistical significance, either in viral load or in CD4(+) T cell count when R5 and X4 prediction groups were compared. Moreover, the GPGR tetramer was the most common V3 loop core motif identified in the HIV-1 strains studied (34.1%) followed by GWGR, identified in 18.1% of the samples. The high level of B subtype in this Brazilian population reinforces the nature of the HIV epidemic in Brazil, and corroborates previous data obtained in the Brazilian HIV-infected population.