986 results for Genome Search
Abstract:
Rapid evolution and high intrahost sequence diversity are hallmarks of human and simian immunodeficiency virus (HIV/SIV) infection. Minor viral variants have important implications for drug resistance, receptor tropism, and immune evasion. Here, we used ultradeep pyrosequencing to sequence complete HIV/SIV genomes, detecting variants present at a frequency as low as 1%. This approach provides a more complete characterization of the viral population than is possible with conventional methods, revealing low-level drug resistance and detecting previously hidden changes in the viral population. While this work applies pyrosequencing to immunodeficiency viruses, this approach could be applied to virtually any viral pathogen.
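The 1% detection threshold mentioned above can be illustrated with a minimal sketch. It assumes the reads have already been aligned and reduced to the bases observed at a single genome position; the function name `minor_variants`, the example column, and the absence of any sequencing-error model are assumptions made for illustration, not details from the study.

```python
from collections import Counter


def minor_variants(column_bases, min_freq=0.01):
    """Return {base: frequency} for non-consensus bases at or above min_freq.

    column_bases: iterable of bases observed at one aligned position
    (hypothetical input format). Sequencing error is not modelled here.
    """
    counts = Counter(column_bases)
    depth = sum(counts.values())
    if depth == 0:
        return {}
    # The most frequent base is treated as the consensus.
    consensus, _ = counts.most_common(1)[0]
    return {
        base: n / depth
        for base, n in counts.items()
        if base != consensus and n / depth >= min_freq
    }


# Hypothetical pileup column: 1000 reads covering one position.
column = "A" * 985 + "G" * 12 + "T" * 3
variants = minor_variants(column)
print(variants)  # G (1.2%) passes the 1% cutoff; T (0.3%) does not
```

A real pipeline would also need to separate true low-frequency variants from platform-specific errors, which this sketch does not attempt.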
Abstract:
We present the first comprehensive study, to our knowledge, on genomic chromosomal analysis in syndromic craniosynostosis. In total, 45 patients with craniosynostotic disorders were screened with a variety of methods including conventional karyotype, microsatellite segregation analysis, subtelomeric multiplex ligation-dependent probe amplification, and whole-genome array-based comparative genome hybridisation. Causative abnormalities were present in 42.2% (19/45) of the samples, and 27.8% (10/36) of the patients with normal conventional karyotype carried submicroscopic imbalances. Our results include a wide variety of imbalances and point to novel chromosomal regions associated with craniosynostosis. The high incidence of pure duplications or trisomies suggests that these are important mechanisms in craniosynostosis, particularly in cases involving the metopic suture.
Abstract:
Feature selection is one of the important and frequently used techniques in data preprocessing. It can improve the efficiency and effectiveness of data mining by reducing the dimensionality of the feature space and removing irrelevant and redundant information. Feature selection can be viewed as a global optimization problem: finding a minimum set of M relevant features that describes the dataset as well as the original N attributes do. In this paper, we apply the adaptive partitioned random search strategy to our feature selection algorithm. Under this search strategy, a partition structure and an evaluation function are proposed for the feature selection problem. The algorithm guarantees the globally optimal solution in theory and avoids complete randomness in the search direction. The good properties of the algorithm are shown through theoretical analysis.
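To make the idea of partitioning the subset space and sampling within each partition concrete, here is a heavily simplified sketch. It is not the algorithm from the paper: the partition rule (fix one feature as included or excluded per round), the greedy choice of region, the toy evaluation function, and all names are assumptions for illustration, and this version carries no global-optimality guarantee.

```python
import random


def partitioned_random_search(n_features, evaluate, samples_per_region=20, seed=0):
    """Greedy partition-and-sample search over feature subsets (illustrative only)."""
    rng = random.Random(seed)
    fixed = {}  # feature index -> True (included) or False (excluded)
    best_subset, best_score = None, float("-inf")
    for feature in range(n_features):
        region_best = {}
        # Split the current region in two: feature included vs. excluded.
        for choice in (True, False):
            trial = {**fixed, feature: choice}
            top = float("-inf")
            for _ in range(samples_per_region):
                # Features not yet fixed are sampled uniformly at random.
                subset = frozenset(
                    i for i in range(n_features)
                    if trial.get(i, rng.random() < 0.5)
                )
                score = evaluate(subset)
                top = max(top, score)
                if score > best_score:
                    best_subset, best_score = subset, score
            region_best[choice] = top
        # Continue the search inside the more promising half.
        fixed[feature] = region_best[True] >= region_best[False]
    return best_subset, best_score


# Hypothetical evaluation: reward relevant features, penalise subset size.
RELEVANT = frozenset({0, 2})


def toy_score(subset):
    return 2 * len(subset & RELEVANT) - len(subset)


subset, score = partitioned_random_search(6, toy_score)
print(sorted(subset), score)
```

Under `toy_score`, the best possible subset is `{0, 2}` with a score of 2; whether a given run finds it depends on the random samples drawn.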
Abstract:
Although patterns of somatic alterations have been reported for tumor genomes, little is known about how they compare with alterations present in non-tumor genomes. A comparison of the two would be crucial to better characterize the genetic alterations driving tumorigenesis. We sequenced the genomes of a lymphoblastoid (HCC1954BL) and a breast tumor (HCC1954) cell line derived from the same patient and compared the somatic alterations present in both. The lymphoblastoid genome presents a comparable number and similar spectrum of nucleotide substitutions to that found in the tumor genome. However, a significant difference in the ratio of non-synonymous to synonymous substitutions was observed between the two genomes (P = 0.031). Protein-protein interaction analysis revealed that mutations in the tumor genome preferentially affect hub-genes (P = 0.0017) and are co-selected to present synergistic functions (P < 0.0001). KEGG analysis showed that in the tumor genome most mutated genes were organized into signaling pathways related to tumorigenesis. No such organization or synergy was observed in the lymphoblastoid genome. Our results indicate that endogenous mutagens and replication errors can generate the overall number of mutations required to drive tumorigenesis and that it is the combination rather than the frequency of mutations that is crucial to complete tumorigenic transformation.
Abstract:
Formal Concept Analysis is an unsupervised machine learning technique that has successfully been applied to document organisation by considering documents as objects and keywords as attributes. The basic algorithms of Formal Concept Analysis then allow an intelligent information retrieval system to cluster documents according to keyword views. This paper investigates the scalability of this idea. In particular, we present the results of applying spatial data structures to large datasets in Formal Concept Analysis. Our experiments are motivated by the application of Formal Concept Analysis to a virtual filesystem [11,17,15], in particular the libferris [1] Semantic File System. This paper presents customizations to an RD-Tree Generalized Index Search Tree based index structure to better support the application of Formal Concept Analysis to large data sources.
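The documents-as-objects, keywords-as-attributes view can be sketched with a brute-force enumeration of formal concepts. This only illustrates the definition of a concept (a set of documents together with exactly the keywords they all share); it is exponential in the number of keywords and is unrelated to the RD-Tree index described in the abstract. The file names and keywords are hypothetical.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate all formal concepts of a small context by brute force.

    context: dict mapping each object (document) to its set of attributes
    (keywords). Returns a set of (extent, intent) pairs of frozensets.
    """
    objects = set(context)
    attributes = set().union(*context.values())
    concepts = set()
    for size in range(len(attributes) + 1):
        for attrs in combinations(sorted(attributes), size):
            # Extent: every object that has all of the chosen attributes.
            extent = frozenset(o for o in objects if set(attrs) <= context[o])
            # Intent: every attribute shared by all objects in the extent.
            intent = frozenset(
                a for a in attributes if all(a in context[o] for o in extent)
            )
            concepts.add((extent, intent))
    return concepts


# Hypothetical document/keyword context.
docs = {
    "a.txt": {"genome", "virus"},
    "b.txt": {"genome", "cattle"},
    "c.txt": {"genome"},
}
concepts = formal_concepts(docs)
for extent, intent in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(extent), sorted(intent))
```

For this context the enumeration yields four concepts, from "all three documents share `genome`" down to the empty extent that carries every keyword.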
Abstract:
Schistosoma mansoni is responsible for the neglected tropical disease schistosomiasis that affects 210 million people in 76 countries. Here we present analysis of the 363 megabase nuclear genome of the blood fluke. It encodes at least 11,809 genes, with an unusual intron size distribution, and new families of micro-exon genes that undergo frequent alternative splicing. As the first sequenced flatworm, and a representative of the Lophotrochozoa, it offers insights into early events in the evolution of the animals, including the development of a body pattern with bilateral symmetry, and the development of tissues into organs. Our analysis has been informed by the need to find new drug targets. The deficits in lipid metabolism that make schistosomes dependent on the host are revealed, and the identification of membrane receptors, ion channels and more than 300 proteases provide new insights into the biology of the life cycle and new targets. Bioinformatics approaches have identified metabolic chokepoints, and a chemogenomic screen has pinpointed schistosome proteins for which existing drugs may be active. The information generated provides an invaluable resource for the research community to develop much needed new control tools for the treatment and eradication of this important and neglected disease.
Abstract:
To understand the biology and evolution of ruminants, the cattle genome was sequenced to about sevenfold coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1217 are absent or undetected in noneutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides a resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production.
Abstract:
The complete genome sequence of wild-type rabies virus (RABV) isolated from a wild Brazilian hoary fox (Dusicyon sp.), the BR-Pfx1 isolate, was determined and compared with fixed RABV strains. The genome of the BR-Pfx1 isolate comprised 11,924 nt and included the five standard genes of rhabdoviruses. Sequences of mRNA start and stop signals for transcription were highly conserved among all structural protein genes of the BR-Pfx1 isolate. All amino acid residues in the glycoprotein (G) gene associated with pathogenicity were retained in the BR-Pfx1 isolate, while unique amino acid substitutions were found in antigenic region I of the nucleoprotein gene and III of G. These results suggest that although the standard genome structure and organization are common to the BR-Pfx1 isolate and fixed RABV strains, the unique amino acid substitutions in functional sites of the BR-Pfx1 isolate may result in different biological characteristics from fixed RABV strains.
Abstract:
Objective: To test the feasibility of an evidence-based clinical literature search service to help answer general practitioners' (GPs') clinical questions. Design: Two search services supplied GPs who submitted questions with the best available empirical evidence to answer these questions. The GPs provided feedback on the value of the service, and concordance of answers from the two search services was assessed. Setting: Two literature search services (Queensland and Victoria), operating for nine months from February 1999. Main outcome measures: Use of the service; time taken to locate answers; availability of evidence; value of the service to GPs; and consistency of answers from the two services. Results: 58 GPs asked 160 questions (29 asked one, 11 asked five or more). The questions concerned treatment (65%), aetiology (17%), prognosis (13%), and diagnosis (5%). Answering a question took a mean of 3 hours 32 minutes of personnel time (95% CI, 2.67-3.97); nine questions took longer than 10 hours each to answer, the longest taking 23 hours 30 minutes. Evidence of suitable quality to provide a sound answer was available for 126 (79%) questions. Feedback data for 84 (53%) questions, provided by 42 GPs, showed that they appreciated the service, and asking the questions changed clinical care. There were many minor differences between the answers from the two centres, and substantial differences in the evidence found for 4/14 questions. However, conclusions reached were largely similar, with no or only minor differences for all questions. Conclusions: It is feasible to provide a literature search service, but further assessment is needed to establish its cost effectiveness.
Abstract:
The complete arrangement of genes in the mitochondrial (mt) genome is known for 12 species of insects, and part of the gene arrangement in the mt genome is known for over 300 other species of insects. The arrangement of genes in the mt genome is highly conserved in the insects studied, since all of the protein-coding and rRNA genes and most of the tRNA genes are arranged in the same way. We sequenced the entire mt genome of the wallaby louse, Heterodoxus macropus, which is 14,670 bp long and has the 37 genes typical of animals and some noncoding regions. The largest noncoding region is 73 bp long (93% A+T), and the second largest is 47 bp long (92% A+T). Both of these noncoding regions seem to be able to form stem-loop structures. The arrangement of genes in the mt genome of this louse is unlike that of any other animal studied. All tRNA genes have moved and/or inverted relative to the ancestral gene arrangement of insects, which is present in the fruit fly Drosophila yakuba. At least nine protein-coding genes (atp6, atp8, cox2, cob, nad1-nad3, nad5, and nad6) have moved; moreover, four of these genes (atp6, atp8, nad1, and nad3) have inverted. The large number of gene rearrangements in the mt genome of H. macropus is unprecedented for an arthropod.
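The A+T percentages quoted for the noncoding regions are simple base-composition figures. A minimal sketch of that calculation follows; the sequence is a made-up 20-nt example, not data from the H. macropus genome, and the function name is hypothetical.

```python
def at_content(sequence):
    """Return the fraction of A and T bases in a DNA sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(seq.count(base) for base in "AT") / len(seq)


# Hypothetical 20-nt region, for illustration only.
region = "ATTTAATATGATTAAATTTA"
print(f"{at_content(region):.0%} A+T")
```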