991 results for sequence database
Abstract:
Speech recognition can be improved by using visual information, in the form of the speaker's lip movements, in addition to audio information. To date, state-of-the-art techniques for audio-visual speech recognition continue to use audio and visual data from the same database for training their models. In this paper, we present a new approach that makes use of one modality of an external dataset in addition to a given audio-visual dataset. By so doing, it is possible to create more powerful models from other extensive audio-only databases and adapt them to our comparatively smaller multi-stream databases. Results show that the presented approach outperforms the widely adopted synchronous hidden Markov models (HMM) trained jointly on audio and visual data of a given audio-visual database for phone recognition by 29% relative. It also outperforms the external audio models trained on extensive external audio datasets, as well as internal audio models, by 5.5% and 46% relative, respectively. We also show that the proposed approach is beneficial in noisy environments where the audio source is affected by environmental noise.
Abstract:
This article uses topological approaches to suggest that education is becoming-topological. Analyses presented in a recent double-issue of Theory, Culture & Society are used to demonstrate the utility of topology for education. In particular, the article explains education's topological character through examining the global convergence of education policy, testing and the discursive ranking of systems, schools and individuals in the promise of reforming education through the proliferation of regimes of testing at local and global levels that constitute a new form of governance through data. In this conceptualisation of global education policy, changes in the form and nature of testing combine with the emergence of global policy networks to change the nature of the local (national, regional, school and classroom) forces that operate through the ‘system’. While these forces change, they work through a discursivity that produces disciplinary effects, but in a different way. This new–old disciplinarity, or ‘database effect’, is here represented through a topological approach because of its utility for conceiving education in an increasingly networked world.
Abstract:
Fossils provide the principal basis for temporal calibrations, which are critical to the accuracy of divergence dating analyses. Translating fossil data into minimum and maximum bounds for calibrations is the most important, and often least appreciated, step of divergence dating. Properly justified calibrations require the synthesis of phylogenetic, paleontological, and geological evidence and can be difficult for non-specialists to formulate. The dynamic nature of the fossil record (e.g., new discoveries, taxonomic revisions, updates of global or local stratigraphy) requires that calibration data be updated continually lest they become obsolete. Here, we announce the Fossil Calibration Database (http://fossilcalibrations.org), a new open-access resource providing vetted fossil calibrations to the scientific community. Calibrations accessioned into this database are based on individual fossil specimens and follow best practices for phylogenetic justification and geochronological constraint. The associated Fossil Calibration Series, a calibration-themed publication series at Palaeontologia Electronica, will serve as one key pipeline for peer-reviewed calibrations to enter the database.
Abstract:
The human genome project was a grand scientific enterprise which attracted hyperbole and ridicule alike. The project was lauded as “the moon shot of the life sciences”, the “holy grail of man”, “the code of codes”, and “the book of life”. Such rhetoric has also received scorn. President George Bush senior managed to deflate the pretensions of the project with the accidental slip that it was the “human gnome initiative”. In The Sequence, Kevin Davies seeks to go beyond such metaphors and provide a candid and honest account of the race to sequence the human genome. The author is indebted to the authoritative book The Gene Wars, which considered the early struggles over the human genome project. Robert Cook-Deegan observes that there was initially much debate over whether there should be a Human Genome Project at all: The debate became one of “big” science versus “small” science. The reliance on systematic technology development and goal-directed gene-mapping efforts presaged a new style for biology, one that elicited excitement from those attracted to whiz-bang technologies but drew gasps of revulsion from those who aspired to cultivate biology on a more modest scale and with decentralized organisation. The battle was, among other things, over whose vision would control the budget and which scientific aesthetic would prevail.
Abstract:
"It's akin to the old Spanish, English and Portuguese explorers. They would take their boats until they found some edge of land, then they would go up and plant the flag of their king or queen. They didn't know what they'd discovered; how big it is, where it goes to - but they would claim it anyway." (David Korn of the Association of American Medical Colleges)
This article analyses recent litigation over patent law and expressed sequence tags (ESTs). In the case of In re Fisher, the United States Court of Appeals for the Federal Circuit engaged in judicial consideration of the revised utility guidelines of the United States Patent and Trademark Office (USPTO). In this matter, the agricultural biotechnology company Monsanto sought to patent ESTs in maize plants. A patent examiner and the Board of Patent Appeals and Interferences had doubted whether the invention claimed in the patent application was useful. Monsanto appealed against the rulings of the USPTO. A number of amici curiae intervened in the matter in support of the USPTO, including Genentech, Affymetrix, Dow AgroSciences, Eli Lilly, the National Academy of Sciences, and the Association of American Medical Colleges. The majority of the Court of Appeals for the Federal Circuit supported the position of the USPTO, and rejected the patent application on the grounds of utility. The split decision highlighted institutional tensions over the appropriate thresholds for patent criteria such as novelty, non-obviousness, and utility. The litigation raised larger questions about the definition of research tools, the incremental nature of scientific progress, and the role of patent law in innovation policy. The decision of In re Fisher will have significant ramifications for gene patents, in the wake of the human genome project. Arguably, the USPTO utility guidelines need to be reinforced by a tougher application of the standards of novelty and non-obviousness in respect of gene patents.
Abstract:
This paper addresses the problem of predicting the outcome of an ongoing case of a business process based on event logs. In this setting, the outcome of a case may refer, for example, to the achievement of a performance objective or the fulfillment of a compliance rule upon completion of the case. Given a log consisting of traces of completed cases, a trace of an ongoing case, and two or more possible outcomes (e.g., a positive and a negative outcome), the paper addresses the problem of determining the most likely outcome for the case in question. Previous approaches to this problem are largely based on simple symbolic sequence classification, meaning that they extract features from traces seen as sequences of event labels, and use these features to construct a classifier for runtime prediction. In doing so, these approaches ignore the data payload associated with each event. This paper approaches the problem from a different angle by treating traces as complex symbolic sequences, that is, sequences of events each carrying a data payload. In this context, the paper outlines different feature encodings of complex symbolic sequences and compares their predictive accuracy on real-life business process event logs.
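To make the distinction between simple and complex symbolic sequences concrete, the following is a minimal sketch and not one of the encodings evaluated in the paper. The trace, the attribute names (`amount`, `channel`) and both encoding functions are hypothetical; the "complex" variant here simply appends the most recent payload value of each attribute to a label-frequency vector.

```python
from collections import Counter

# Hypothetical prefix of an ongoing case: each event has a label and a data payload.
trace = [
    {"activity": "register", "payload": {"amount": 120.0, "channel": "web"}},
    {"activity": "check", "payload": {"amount": 120.0, "channel": "web"}},
    {"activity": "check", "payload": {"amount": 95.0, "channel": "phone"}},
]


def encode_simple(trace, alphabet):
    """Simple symbolic encoding: how often each event label occurs in the trace."""
    counts = Counter(event["activity"] for event in trace)
    return [counts[label] for label in alphabet]


def encode_complex(trace, alphabet, attributes):
    """Label frequencies plus the latest observed value of each payload attribute."""
    latest = {}
    for event in trace:
        latest.update(event["payload"])  # later events overwrite earlier values
    return encode_simple(trace, alphabet) + [latest.get(a) for a in attributes]


alphabet = ["register", "check", "approve"]
simple_features = encode_simple(trace, alphabet)
complex_features = encode_complex(trace, alphabet, ["amount", "channel"])
```

Either feature vector could then be passed to any standard classifier trained on completed cases; the second one carries the payload information that a label-only encoding discards.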
Abstract:
Idiomarina sp. strain 28-8 is an aerobic, Gram-negative, flagellar bacterium isolated from the bodies of ark shells (Scapharca broughtonii) collected from underwater sediments in Gangjin Bay, South Korea. Here, we present the draft genome sequence of Idiomarina sp. 28-8 (2,971,606 bp, with a G+C content of 46.9%), containing 2,795 putative coding sequences.
Abstract:
The native Asian oyster, Crassostrea ariakensis, is one of the most common and important Crassostrea species that occur naturally along the coast of East Asia. Molecular species diagnosis is a prerequisite for population genetic analysis of wild oyster populations because oyster species cannot be discriminated reliably using external morphological characters alone, owing to character ambiguity. To date there have been few phylogeographic studies of natural edible oyster populations in East Asia; this is particularly true of C. ariakensis, the common species in Korea. We therefore assessed the levels and patterns of molecular genetic variation in East Asian wild populations of C. ariakensis from Korea, Japan, and China using DNA sequence analysis of five concatenated mtDNA regions, namely 16S rRNA, cytochrome oxidase I, cytochrome oxidase II, cytochrome oxidase III, and cytochrome b. Two divergent C. ariakensis clades were identified, separating southern China from the remaining sites in the northern region. In addition, hierarchical AMOVA and pairwise ΦST analyses showed that genetic diversity was discontinuous among wild populations of C. ariakensis in East Asia. Biogeographical and historical sea level changes are discussed as potential factors that may have influenced the genetic heterogeneity of wild C. ariakensis stocks across this region.
Abstract:
Single nucleotide polymorphisms (SNPs) are widely acknowledged as the marker of choice for many genetic and genomic applications because they show co-dominant inheritance, are highly abundant across genomes and are suitable for high-throughput genotyping. Here we evaluated the applicability of SNP markers developed from Crassostrea gigas and C. virginica expressed sequence tags (ESTs) in closely related Crassostrea and Ostrea species. A total of 213 putative interspecific-level SNPs were identified from re-sequencing data in six amplicons, yielding an average of one interspecific-level SNP per seven bp. High polymorphism levels were observed, and the high success rate of transferability shows that genic EST-derived SNP markers provide an efficient method for rapid marker development and SNP discovery in closely related oyster species. The six EST-SNP markers identified here will provide useful molecular tools for addressing questions in molecular ecology and evolution studies, including stock analysis (pedigree monitoring) in related oyster taxa.
Abstract:
Striped catfish (Pangasianodon hypophthalmus) is a commercially important freshwater fish used in inland aquaculture in the Mekong Delta, Vietnam. The culture industry, however, is facing a significant challenge from saltwater intrusion into many low-lying coastal provinces across the Mekong Delta as a result of predicted climate change impacts. Developing genomic resources for this species can facilitate the production of improved culture lines that can withstand raised salinity conditions, and so we have applied high-throughput Ion Torrent sequencing of transcriptome libraries from six target osmoregulatory organs from striped catfish as a genomic resource for use in future selection strategies. After trimming and processing, we obtained 12,177,770 reads with an average length of 97 bp. De novo assemblies were generated using CLC Genomics Workbench, Trinity and Velvet/Oases, with the best overall contig performance resulting from the CLC assembly. De novo assembly using CLC yielded 66,451 contigs with an average length of 478 bp and an N50 length of 506 bp. A total of 37,969 contigs (57%) possessed significant similarity with proteins in the non-redundant database. Comparative analyses revealed that a significant number of contigs matched sequences reported in other teleost fishes, ranging in similarity from 45.2% with Atlantic cod to 52% with zebrafish. In addition, 28,879 simple sequence repeats (SSRs) and 55,721 single nucleotide polymorphisms (SNPs) were detected in the striped catfish transcriptome. The sequence collection generated in the current study represents the most comprehensive genomic resource for P. hypophthalmus available to date. Our results illustrate the utility of next-generation sequencing as an efficient tool for constructing a large genomic database for marker development in non-model species.
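For readers unfamiliar with the N50 statistic quoted above, the sketch below shows the usual definition: the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. The function name and the toy contig lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Toy example: total length is 1500 bp, so half is 750 bp.
# 500 + 400 = 900 >= 750, hence the N50 is 400.
toy_contigs = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_contigs)
```

Because N50 is weighted by length, it can exceed the mean contig length, as it does for the assembly reported here (506 bp versus 478 bp).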
Abstract:
Genome-wide association studies (GWAS) have identified around 60 common variants associated with multiple sclerosis (MS), but these loci only explain a fraction of the heritability of MS. Some missing heritability may be caused by rare variants that have been suggested to play an important role in the aetiology of complex diseases such as MS. However, current genetic and statistical methods for detecting rare variants are expensive and time consuming. 'Population-based linkage analysis' (PBLA), or so-called identity-by-descent (IBD) mapping, is a novel way to detect rare variants in extant GWAS datasets. We employed BEAGLE fastIBD to search for rare MS variants utilising IBD mapping in a large GWAS dataset of 3,543 cases and 5,898 controls. We identified a genome-wide significant linkage signal on chromosome 19 (LOD = 4.65; p = 1.9×10⁻⁶). Network analysis of cases and controls sharing haplotypes on chromosome 19 further strengthened the association, as there are more large networks of cases sharing haplotypes than of controls. This linkage region includes a cluster of zinc finger genes of unknown function. Analysis of genome-wide transcriptome data suggests that genes in this zinc finger cluster may be involved in very early developmental regulation of the CNS. Our study also indicates that BEAGLE fastIBD allowed identification of rare variants in a large population of unrelated individuals with moderate computational intensity. Even with the development of whole-genome sequencing, IBD mapping may still be a promising way to narrow down the region of interest for sequencing priority. © 2013 Lin et al.
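The relationship between the LOD score and the p-value quoted above can be illustrated with a common textbook conversion. This is a sketch under stated assumptions, not necessarily the authors' procedure: it treats 2·ln(10)·LOD as a chi-square statistic with one degree of freedom and takes a one-sided p-value. The function name `lod_to_p` is hypothetical.

```python
import math


def lod_to_p(lod):
    """Approximate one-sided p-value for a LOD score (assumes chi-square, 1 df)."""
    chi2 = 2 * math.log(10) * lod  # likelihood-ratio statistic
    # For 1 df, the upper tail P(X > chi2) equals erfc(sqrt(chi2 / 2));
    # halving it gives the one-sided p-value.
    return 0.5 * math.erfc(math.sqrt(chi2 / 2))


p_value = lod_to_p(4.65)
print(f"LOD 4.65 -> p = {p_value:.2e}")
```

Under these assumptions the result is of the same order as the reported p-value, which suggests that the two figures in the abstract are consistent with each other.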
Abstract:
Objective: To determine the relative effects of genetic and environmental factors in susceptibility to ankylosing spondylitis (AS).
Methods: Twins with AS were identified from the Royal National Hospital for Rheumatic Diseases database. Clinical and radiographic examinations were performed to establish diagnoses, and disease severity was assessed using a combination of validated scoring systems. HLA typing for HLA-B27, HLA-B60, and HLA-DR1 was performed by polymerase chain reaction with sequence-specific primers, and zygosity was assessed using microsatellite markers. Genetic and environmental variance components were assessed with the program Mx, using data from this and previous studies of twins with AS.
Results: Six of 8 monozygotic (MZ) twin pairs were disease concordant, compared with 4 of 15 B27-positive dizygotic (DZ) twin pairs (27%) and 4 of 32 DZ twin pairs overall (12.5%). Nonsignificant increases in similarity with regard to age at disease onset and all of the disease severity scores assessed were noted in disease-concordant MZ twins compared with concordant DZ twins. HLA-B27 and B60 were associated with the disease in probands, and the rate of disease concordance was significantly increased among DZ twin pairs in which the co-twin was positive for both B27 and DR1. Additive genetic effects were estimated to contribute 97% of the population variance.
Conclusion: Susceptibility to AS is largely genetically determined, and the environmental trigger for the disease is probably ubiquitous. HLA-B27 accounts for a minority of the overall genetic susceptibility to AS.
Abstract:
Molecular phylogenetic studies of homologous sequences of nucleotides often assume that the underlying evolutionary process was globally stationary, reversible, and homogeneous (SRH), and that a model of evolution with one or more site-specific and time-reversible rate matrices (e.g., the GTR rate matrix) is enough to accurately model the evolution of data over the whole tree. However, an increasing body of data suggests that evolution under these conditions is an exception, rather than the norm. To address this issue, several non-SRH models of molecular evolution have been proposed, but they either ignore heterogeneity in the substitution process across sites (HAS) or assume it can be modeled accurately using the Γ distribution. As an alternative to these models of evolution, we introduce a family of mixture models that approximate HAS without the assumption of an underlying predefined statistical distribution. This family of mixture models is combined with non-SRH models of evolution that account for heterogeneity in the substitution process across lineages (HAL). We also present two algorithms for searching model space and identifying an optimal model of evolution that is less likely to over- or underparameterize the data. The performance of the two new algorithms was evaluated using alignments of nucleotides with 10 000 sites simulated under complex non-SRH conditions on a 25-tipped tree. The algorithms were found to be very successful, identifying the correct HAL model with a 75% success rate (the average success rate for assigning rate matrices to the tree's 48 edges was 99.25%) and, for the correct HAL model, identifying the correct HAS model with a 98% success rate. Finally, parameter estimates obtained under the correct HAL-HAS model were found to be accurate and precise.
The merits of our new algorithms were illustrated with an analysis of 42 337 second codon sites extracted from a concatenation of 106 alignments of orthologous genes encoded by the nuclear genomes of Saccharomyces cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, S. castellii, S. kluyveri, S. bayanus, and Candida albicans. Our results show that second codon sites in the ancestral genome of these species contained 49.1% invariable sites, 39.6% variable sites belonging to one rate category (V1), and 11.3% variable sites belonging to a second rate category (V2). The ancestral nucleotide content was found to differ markedly across these three sets of sites, and the evolutionary processes operating at the variable sites were found to be non-SRH and best modeled by a combination of eight edge-specific rate matrices (four for V1 and four for V2). The number of substitutions per site at the variable sites also differed markedly, with sites belonging to V1 evolving slower than those belonging to V2 along the lineages separating the seven species of Saccharomyces. Finally, sites belonging to V1 appeared to have ceased evolving along the lineages separating S. cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, and S. bayanus, implying that they might have become so selectively constrained that they could be considered invariable sites in these species.
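The proportions of invariable, V1 and V2 sites reported above can be read as weights in a finite mixture over site classes. As a generic sketch only (the symbols $p_0$, $p_k$, $L_k$ and $\pi^{(0)}$ are notation assumed here, and the paper's exact parameterisation may differ), the likelihood of the nucleotide pattern $x_i$ at site $i$ under one invariable class and two variable classes has the form:

```latex
L(x_i) = p_0 \,\mathbf{1}[x_i \text{ is constant}]\, \pi^{(0)}_{x_i}
       + \sum_{k=1}^{2} p_k \, L_k(x_i),
\qquad p_0 + p_1 + p_2 = 1
```

Here $\pi^{(0)}_{x_i}$ is the frequency, within the invariable class, of the nucleotide shared by all sequences at a constant site, and $L_k(x_i)$ is the likelihood of the pattern under the edge-specific rate matrices assigned to rate category $V_k$. The reported estimates $p_0 = 0.491$, $p_1 = 0.396$ and $p_2 = 0.113$ sum to 1, as the constraint requires.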
Abstract:
With new national targets for patient flow in public hospitals designed to increase efficiencies in patient care and resource use, better knowledge of events affecting length of stay will support improved bed management and scheduling of procedures. This paper presents a case study involving the integration of material from each of three databases in operation at one tertiary hospital and demonstrates that it is possible to follow patient journeys from admission to discharge. What is known about this topic? At present, patient data at one Queensland tertiary hospital are assembled in three information systems: (1) the Hospital Based Corporate Information System (HBCIS), which tracks patients from in-patient admission to discharge; (2) the Emergency Department Information System (EDIS), containing patient data from presentation to departure from the emergency department; and (3) the Operation Room Management Information System (ORMIS), which records surgical operations. What does this paper add? This paper describes how a new enquiry tool may be used to link the three hospital information systems for studying the hospital journey through different wards and/or operating theatres for both individual patients and groups of patients. What are the implications for practitioners? An understanding of the patients’ journeys provides better insight into patient flow and provides a tool for research relating to access block, as well as optimising the use of physical and human resources.
Abstract:
The koala (Phascolarctos cinereus) is an Australian marsupial that continues to experience significant population declines. Infectious diseases caused by pathogens such as Chlamydia are proposed to have a major role. Very few species-specific immunological reagents are available, severely hindering our ability to respond to the threat of infectious diseases in the koala. In this study, we utilise data from the sequencing of the koala transcriptome to identify key immunological markers of the koala adaptive immune response and cytokines known to be important in the host response to chlamydial infection in other species. This report describes the identification and preliminary sequence analysis of (1) T lymphocyte glycoprotein markers (CD4, CD8); (2) IL-4, a marker for the Th2 response; (3) cytokines such as IL-6, IL-12 and IL-1β, that have been shown to have a role in chlamydial clearance and pathology in other hosts; and (4) the sequences for the koala immunoglobulins, IgA, IgG, IgE and IgM. These sequences will enable the development of a range of immunological reagents for understanding the koala’s innate and adaptive immune responses, while also providing a resource that will enable continued investigations into the origin and evolution of the marsupial immune system.