738 results for Annotation de génomes
Abstract:
The implementation of a new national curriculum and standards-referenced assessment in Australia has been an opportunity and a challenge for teacher assessment practices. In this case study of teachers in two Queensland schools, we explore how annotating student or exemplar assessment tasks could support teacher assessment practice. Three learning conversations between the researchers and the teacher teams are interpreted through the lens of Bernstein’s (1999) horizontal and vertical discourses to understand the complexities of coming to know an assessment standard. The study contributes to the literature on the use of annotations by exploring how teachers negotiated the purposes and processes of annotation, how annotating student work or exemplars before teaching commenced supported teachers to experience greater clarity about assessment standards and, finally, some of the tensions experienced by the teachers as they considered this practice within the practicalities of their daily work.
Abstract:
Chaperone-usher (CU) fimbriae are adhesive surface organelles common to many Gram-negative bacteria. Escherichia coli genomes contain a large variety of characterised and putative CU fimbrial operons; however, the classification and annotation of individual loci remain problematic. Here we describe a classification model based on usher phylogeny and genomic locus position to categorise the CU fimbrial types of E. coli. Using the BLASTp algorithm, an iterative usher protein search was performed to identify CU fimbrial operons from 35 E. coli (and one Escherichia fergusonii) genomes representing different pathogenic and phylogenic lineages, as well as 132 Escherichia spp. plasmids. A total of 458 CU fimbrial operons were identified, which represent 38 distinct fimbrial types based on genomic locus position and usher phylogeny. The majority of fimbrial operon types occupied a specific locus position on the E. coli chromosome; exceptions were associated with mobile genetic elements. A group of core-associated E. coli CU fimbriae was defined and includes the Type 1, Yad, Yeh, Yfc, Mat, F9 and Ybg fimbriae. These genes were present as intact or disrupted operons at the same genetic locus in almost all genomes examined. Evaluation of the distribution and prevalence of CU fimbrial types among different pathogenic and phylogenic groups provides an overview of group-specific fimbrial profiles and insight into the ancestry and evolution of CU fimbriae in E. coli.
Abstract:
Determination of sequence similarity is a central issue in computational biology, a problem addressed primarily through BLAST, an alignment-based heuristic which has underpinned much of the analysis and annotation of the genomic era. Despite their success, alignment-based approaches scale poorly with increasing data set size, and are not robust under structural sequence rearrangements. Successive waves of innovation in sequencing technologies – so-called Next Generation Sequencing (NGS) approaches – have led to an explosion in data availability, challenging existing methods and motivating novel approaches to sequence representation and similarity scoring, including adaptation of existing methods from other domains such as information retrieval. In this work, we investigate locality-sensitive hashing of sequences through binary document signatures, applying the method to a bacterial protein classification task. Here, the goal is to predict the gene family to which a given query protein belongs. Experiments carried out on a pair of small but biologically realistic datasets (the full protein repertoires of families of Chlamydia and Staphylococcus aureus genomes respectively) show that a measure of similarity obtained by locality-sensitive hashing gives highly accurate results while offering a number of avenues which will lead to substantial performance improvements over BLAST.
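As a rough illustration of comparing sequences through binary signatures, the sketch below builds a simhash-style signature from overlapping k-mers and compares two signatures by Hamming similarity. The abstract does not specify the signature scheme, so the k-mer length, the 64-bit width, the use of SHA-256 and the function names are all assumptions made for illustration, not the authors' method.

```python
import hashlib


def kmers(seq, k=3):
    """Return the overlapping k-mers of a sequence (empty if len(seq) < k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def signature(seq, k=3, bits=64):
    """Simhash-style binary signature (assumed scheme, for illustration only)."""
    counts = [0] * bits
    for km in kmers(seq, k):
        digest = hashlib.sha256(km.encode()).digest()
        h = int.from_bytes(digest[:bits // 8], "big")
        # Each k-mer votes +1 or -1 on every bit position.
        for b in range(bits):
            counts[b] += 1 if (h >> b) & 1 else -1
    # A bit is set when the positive votes outnumber the negative ones.
    return sum(1 << b for b in range(bits) if counts[b] > 0)


def hamming_similarity(sig_a, sig_b, bits=64):
    """Fraction of bit positions on which two signatures agree."""
    return 1.0 - bin(sig_a ^ sig_b).count("1") / bits
```

Signatures are computed once per sequence, so comparing a query against a collection reduces to XOR and bit-count operations rather than pairwise alignment.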
Abstract:
Due to the popularity of security cameras in public places, it is of interest to design an intelligent system that can efficiently detect events automatically. This paper proposes a novel algorithm for multi-person event detection. To ensure greater than real-time performance, features are extracted directly from compressed MPEG video. A novel histogram-based feature descriptor that captures the angles between extracted particle trajectories is proposed, which allows us to capture motion patterns of multi-person events in the video. To alleviate the need for fine-grained annotation, we propose the use of Labelled Latent Dirichlet Allocation, a “weakly supervised” method that allows the use of coarse temporal annotations which are much simpler to obtain. This novel system is able to run at approximately ten times real-time, while preserving state-of-the-art detection performance for multi-person events on a 100-hour real-world surveillance dataset (TRECVid SED).
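To make the idea of a histogram over angles between trajectories concrete, here is a minimal sketch. The abstract does not define the descriptor, so this version assumes each trajectory is a list of (x, y) points, takes its direction from the first to the last point, and bins the pairwise angles over [0, π]. The function name and binning are hypothetical.

```python
import math


def angle_histogram(trajectories, bins=8):
    """Histogram of pairwise angles between trajectory directions (a sketch)."""
    # Overall direction of each trajectory, from its first to its last point.
    directions = [
        math.atan2(t[-1][1] - t[0][1], t[-1][0] - t[0][0]) for t in trajectories
    ]
    hist = [0] * bins
    for i in range(len(directions)):
        for j in range(i + 1, len(directions)):
            diff = abs(directions[i] - directions[j]) % (2 * math.pi)
            angle = min(diff, 2 * math.pi - diff)  # folded into [0, pi]
            idx = min(int(angle / math.pi * bins), bins - 1)
            hist[idx] += 1
    return hist
```

Under these assumptions, people moving in parallel fill the lowest bin, while people converging or crossing populate the higher ones.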
Abstract:
Assessment for Learning practices with students such as feedback, and self- and peer assessment are opportunities for teachers and students to develop a shared understanding of how to create quality learning performances. Quality is often represented through achievement standards. This paper explores how primary school teachers in Australia used the process of annotating work samples to develop shared understanding of achievement standards during their curriculum planning phase, and how this understanding informed their teaching so that their students also developed this understanding. Bernstein's concept of the pedagogic device is used to identify the ways teachers recontextualised their assessment knowledge into their pedagogic practices. Two researchers worked alongside seven primary school teachers in two schools over a year, gathering qualitative data through focus groups and interviews. Three general recontextualising approaches were identified in the case studies; recontextualising standards by reinterpreting the role of rubrics, recontextualising by replicating the annotation process with the students and recontextualising by reinterpreting practices with students. While each approach had strengths and limitations, all of the teachers concluded that annotating conversations in the planning phase enhanced their understanding, and informed their practices in helping students to understand expectations for quality.
Abstract:
Background: The koala, Phascolarctos cinereus, is a biologically unique and evolutionarily distinct Australian arboreal marsupial. The goal of this study was to sequence the transcriptome from several tissues of two geographically separate koalas, and to create the first comprehensive catalog of annotated transcripts for this species, enabling detailed analysis of the unique attributes of this threatened native marsupial, including infection by the koala retrovirus. Results: RNA-Seq data was generated from a range of tissues from one male and one female koala and assembled de novo into transcripts using Velvet-Oases. Transcript abundance in each tissue was estimated. Transcripts were searched for likely protein-coding regions and a non-redundant set of 117,563 putative protein sequences was produced. In similarity searches there were 84,907 (72%) sequences that aligned to at least one sequence in the NCBI nr protein database. The best alignments were to sequences from other marsupials. After applying a reciprocal best hit requirement of koala sequences to those from tammar wallaby, Tasmanian devil and the gray short-tailed opossum, we estimate that our transcriptome dataset represents approximately 15,000 koala genes. The marsupial alignment information was used to look for potential gene duplications and we report evidence for copy number expansion of the alpha amylase gene, and of an aldehyde reductase gene. Koala retrovirus (KoRV) transcripts were detected in the transcriptomes. These were analysed in detail and the structure of the spliced envelope gene transcript was determined. There was appreciable sequence diversity within KoRV, with 233 sites in the KoRV genome showing small insertions/deletions or single nucleotide polymorphisms. Both koalas had sequences from the KoRV-A subtype, but the male koala transcriptome has, in addition, sequences more closely related to the KoRV-B subtype. This is the first report of a KoRV-B-like sequence in a wild population.
Conclusions: This transcriptomic dataset is a useful resource for molecular genetic studies of the koala, for evolutionary genetic studies of marsupials, for validation and annotation of the koala genome sequence, and for investigation of koala retrovirus. Annotated transcripts can be browsed and queried at http://koalagenome.org
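The reciprocal best hit requirement mentioned in the Results can be sketched as follows. The input format (dictionaries mapping a query/subject pair to an alignment score) and the function names are assumptions for illustration; the abstract does not describe how the alignment scores were stored or how ties were broken.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict of (query, subject) -> alignment score (hypothetical format).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)
```

A pair is kept only when each sequence is the other's top-scoring match, which filters out one-directional hits such as paralogs that all align best to the same subject.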
Abstract:
Active learning approaches reduce the annotation cost required by traditional supervised approaches to reach the same effectiveness by actively selecting informative instances during the learning phase. However, effectiveness and robustness of the learnt models are influenced by a number of factors. In this paper we investigate the factors that affect the effectiveness, more specifically in terms of stability and robustness, of active learning models built using conditional random fields (CRFs) for information extraction applications. Stability, defined as a small variation in performance when the training data or the parameters vary slightly, is a major issue for machine learning models, but even more so in the active learning framework, which aims to minimise the amount of training data required. The factors we investigate are a) the choice of incremental vs. standard active learning, b) the feature set used as a representation of the text (i.e., morphological features, syntactic features, or semantic features) and c) Gaussian prior variance as one of the important CRF parameters. Our empirical findings show that incremental learning and the Gaussian prior variance lead to more stable and robust models across iterations. Our study also demonstrates that orthographical, morphological and contextual features as a group of basic features play an important role in learning effective models across all iterations.
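The selection step of an active learning loop can be illustrated with a small sketch. The abstract does not say which selection criterion was used, so least-confidence sampling is assumed here, and the input format (a dictionary of per-instance class probabilities from the current model) is hypothetical.

```python
def least_confident(probabilities, batch_size=2):
    """Pick the instances whose top predicted probability is lowest.

    probabilities: dict of instance id -> list of class probabilities
    produced by the current model (hypothetical format).
    """
    confidence = {inst: max(probs) for inst, probs in probabilities.items()}
    # Sort by confidence, breaking ties by id so the result is deterministic.
    ranked = sorted(confidence, key=lambda inst: (confidence[inst], inst))
    return ranked[:batch_size]
```

The selected instances would be sent for annotation and added to the training set before the model is retrained, either from scratch or incrementally.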
Abstract:
In our large library of annotated environmental recordings of animal vocalizations, searching annotations by label can return thousands of results. We propose a heat map of aggregated annotation time and frequency bounds, maintaining the shape of the annotations as they appear on the spectrogram. This compactly displays the distribution of annotation bounds for the user's query, and allows them to easily identify unusual annotations. Key to this is allowing zero values on the map to be differentiated from areas where there are single annotations.
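A minimal sketch of the aggregation behind such a heat map is shown below. It assumes each annotation is a rectangle (start time, end time, low frequency, high frequency) and counts how many annotations cover each grid cell. The grid resolution, tuple layout and function name are assumptions, not details from the abstract.

```python
import math


def annotation_heatmap(annotations, t_max, f_max, t_bins=10, f_bins=10):
    """Count how many annotation rectangles cover each time-frequency cell."""
    grid = [[0] * t_bins for _ in range(f_bins)]
    for t0, t1, f0, f1 in annotations:
        # The start bound is floored and the end bound is ceiled, so every
        # annotation covers at least one cell.
        c0 = min(int(t0 / t_max * t_bins), t_bins - 1)
        c1 = max(min(math.ceil(t1 / t_max * t_bins), t_bins), c0 + 1)
        r0 = min(int(f0 / f_max * f_bins), f_bins - 1)
        r1 = max(min(math.ceil(f1 / f_max * f_bins), f_bins), r0 + 1)
        for r in range(r0, r1):
            for c in range(c0, c1):
                grid[r][c] += 1
    return grid
```

Cells that no annotation touches stay at exactly 0, so a renderer can give zero its own colour and keep it visually distinct from cells with a single annotation.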
Abstract:
One main challenge in developing a system for visual surveillance event detection is the annotation of target events in the training data. By making use of the assumption that events with security interest are often rare compared to regular behaviours, this paper presents a novel approach by using Kullback-Leibler (KL) divergence for rare event detection in a weakly supervised learning setting, where only clip-level annotation is available. It will be shown that this approach outperforms state-of-the-art methods on a popular real-world dataset, while preserving real-time performance.
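To show how KL divergence can flag rare behaviour, the sketch below scores each clip by the divergence of its feature histogram from the average histogram over all clips. This is only an illustration of the general idea: the abstract does not give the paper's formulation, and the histogram representation, smoothing constant and function names are assumptions.

```python
import math


def normalise(hist):
    """Turn non-negative counts into a probability distribution."""
    total = sum(hist)
    return [h / total for h in hist]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions, with light smoothing."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0
    )


def rank_clips_by_rarity(clip_histograms):
    """Return clip indices ordered from most to least divergent."""
    dists = [normalise(h) for h in clip_histograms]
    n = len(dists)
    mean = [sum(d[i] for d in dists) / n for i in range(len(dists[0]))]
    scores = [kl_divergence(d, mean) for d in dists]
    return sorted(range(n), key=lambda i: scores[i], reverse=True)
```

Clips that look like the bulk of the data score close to zero, while a clip whose feature distribution departs from the norm rises to the top of the ranking.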
Abstract:
Viewer interests, evoked by video content, can potentially identify the highlights of the video. This paper explores the use of facial expressions (FE) and heart rate (HR) of viewers, captured using a camera and a non-strapped sensor, for identifying interesting video segments. The data from ten subjects with three videos showed that these signals are viewer dependent and not synchronized with the video contents. To address this issue, new algorithms are proposed to effectively combine FE and HR signals for identifying the time when viewer interest is potentially high. The results show that, compared with subjective annotation and match report highlights, ‘non-neutral’ FE and ‘relatively higher and faster’ HR are able to capture 60%-80% of goal, foul, and shot-on-goal soccer video events. FE is found to be more indicative than HR of viewers’ interests, but the fusion of these two modalities outperforms each of them.
Abstract:
Extracellular polysaccharides are major immunogenic components of the bacterial cell envelope. However, little is known about their biosynthesis in the genus Acinetobacter, which includes A. baumannii, an important nosocomial pathogen. Whether Acinetobacter spp. produce a capsule or a lipopolysaccharide carrying an O antigen or both is not resolved. To explore these issues, genes involved in the synthesis of complex polysaccharides were located in 10 complete A. baumannii genome sequences, and the function of each of their products was predicted via comparison to enzymes with a known function. The absence of a gene encoding a WaaL ligase, required to link the carbohydrate polymer to the lipid A-core oligosaccharide (lipooligosaccharide) forming lipopolysaccharide, suggests that only a capsule is produced. Nine distinct arrangements of a large capsule biosynthesis locus, designated KL1 to KL9, were found in the genomes. Three forms of a second, smaller variable locus, likely to be required for synthesis of the outer core of the lipid A-core moiety, were designated OCL1 to OCL3 and also annotated. Each K locus includes genes for capsule export as well as genes for synthesis of activated sugar precursors, and for glycosyltransfer, glycan modification and oligosaccharide repeat-unit processing. The K loci all include the export genes at one end and genes for synthesis of common sugar precursors at the other, with a highly variable region that includes the remaining genes in between. Five different capsule loci, KL2, KL6, KL7, KL8 and KL9, were detected in multiply antibiotic-resistant isolates belonging to global clone 2, and two other loci, KL1 and KL4, in global clone 1. This indicates that this region is being substituted repeatedly in multiply antibiotic-resistant isolates from these clones.
Abstract:
In the Yersinia pseudotuberculosis serotyping scheme, 21 serotypes are present originating from about 30 different O-factors distributed within the species. With regard to the chemical structures of lipopolysaccharides (LPSs) and the genetic basis of their biosynthesis, a number, but not all, of Y. pseudotuberculosis strains representing different serotypes have been investigated. In order to present an overall picture of the relationship between genetics and structures, we have been working on the genetics and structures of various Y. pseudotuberculosis O-specific polysaccharides (OPSs). Here, we present a structural and genetic analysis of the Y. pseudotuberculosis serotype O:11 OPS. Our results showed that this OPS structure has the same backbone as that of Y. pseudotuberculosis O:1b, but with a 6d-l-Altf side-branch instead of Parf. The 3′ end of the gene cluster is the same as that for O:1b and has the genes for synthesis of the backbone and for processing the completed repeat unit. The 5′ end has genes for synthesis of 6d-l-Altf and its transfer to the repeating unit backbone. The pathway for the synthesis of the 6d-l-Altf appears to be different from that for 6d-l-Altp in Y. enterocolitica O:3. The chemical structure of the O:11 repeating unit is [Figure]
Abstract:
The O-specific polysaccharide (OPS) is a variable constituent of the lipopolysaccharide of Gram-negative bacteria. The polymorphic nature of OPSs within a species is usually first defined serologically, and the current serotyping scheme for Yersinia pseudotuberculosis consists of 21 O serotypes, of which 15 have been characterized genetically and structurally. Here, we present the structure and DNA sequence of Y. pseudotuberculosis O:10 OPS. The O unit consists of one residue each of d-galactopyranose, N-acetyl-d-galactosamine (2-amino-2-deoxy-d-galactopyranose) and d-glucopyranose in the backbone, with two colitose (3,6-dideoxy-l-xylo-hexopyranose) side-branch residues. This structure is very similar to that shared by Escherichia coli O111 and Salmonella enterica O35. The gene cluster sequences of these serotypes, however, have only low levels of similarity to that of Y. pseudotuberculosis O:10, although there is significant conservation of gene order. Within Y. pseudotuberculosis, the O:10 structure is most closely related to the O:6 and O:7 structures.