680 resultados para Annotation


Relevância:

20.00% 20.00%

Publicador:

Resumo:

The exocarp, or skin, of fleshy fruit is a specialized tissue that protects the fruit, attracts seed dispersing fruit eaters, and has large economical relevance for fruit quality. Development of the exocarp involves regulated activities of many genes. This research analyzed global gene expression in the exocarp of developing sweet cherry (Prunus avium L., 'Regina'), a fruit crop species with little public genomic resources. A catalog of transcript models (contigs) representing expressed genes was constructed from de novo assembled short complementary DNA (cDNA) sequences generated from developing fruit between flowering and maturity at 14 time points. Expression levels in each sample were estimated for 34 695 contigs from numbers of reads mapping to each contig. Contigs were annotated functionally based on BLAST, gene ontology and InterProScan analyses. Coregulated genes were detected using partitional clustering of expression patterns. The results are discussed with emphasis on genes putatively involved in cuticle deposition, cell wall metabolism and sugar transport. The high temporal resolution of the expression patterns presented here reveals finely tuned developmental specialization of individual members of gene families. Moreover, the de novo assembled sweet cherry fruit transcriptome with 7760 full-length protein coding sequences and over 20 000 other, annotated cDNA sequences together with their developmental expression patterns is expected to accelerate molecular research on this important tree fruit crop.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

With the rise of smart phones, lifelogging devices (e.g. Google Glass) and popularity of image sharing websites (e.g. Flickr), users are capturing and sharing every aspect of their life online producing a wealth of visual content. Of these uploaded images, the majority are poorly annotated or exist in complete semantic isolation making the process of building retrieval systems difficult as one must firstly understand the meaning of an image in order to retrieve it. To alleviate this problem, many image sharing websites offer manual annotation tools which allow the user to “tag” their photos, however, these techniques are laborious and as a result have been poorly adopted; Sigurbjörnsson and van Zwol (2008) showed that 64% of images uploaded to Flickr are annotated with < 4 tags. Due to this, an entire body of research has focused on the automatic annotation of images (Hanbury, 2008; Smeulders et al., 2000; Zhang et al., 2012a) where one attempts to bridge the semantic gap between an image’s appearance and meaning e.g. the objects present. Despite two decades of research the semantic gap still largely exists and as a result automatic annotation models often offer unsatisfactory performance for industrial implementation. Further, these techniques can only annotate what they see, thus ignoring the “bigger picture” surrounding an image (e.g. its location, the event, the people present etc). Much work has therefore focused on building photo tag recommendation (PTR) methods which aid the user in the annotation process by suggesting tags related to those already present. These works have mainly focused on computing relationships between tags based on historical images e.g. that NY and timessquare co-exist in many images and are therefore highly correlated. However, tags are inherently noisy, sparse and ill-defined often resulting in poor PTR accuracy e.g. does NY refer to New York or New Year? This thesis proposes the exploitation of an image’s context which, unlike textual evidences, is always present, in order to alleviate this ambiguity in the tag recommendation process. Specifically we exploit the “what, who, where, when and how” of the image capture process in order to complement textual evidences in various photo tag recommendation and retrieval scenarios. In part II, we combine text, content-based (e.g. # of faces present) and contextual (e.g. day-of-the-week taken) signals for tag recommendation purposes, achieving up to a 75% improvement to precision@5 in comparison to a text-only TF-IDF baseline. We then consider external knowledge sources (i.e. Wikipedia & Twitter) as an alternative to (slower moving) Flickr in order to build recommendation models on, showing that similar accuracy could be achieved on these faster moving, yet entirely textual, datasets. In part II, we also highlight the merits of diversifying tag recommendation lists before discussing at length various problems with existing automatic image annotation and photo tag recommendation evaluation collections. In part III, we propose three new image retrieval scenarios, namely “visual event summarisation”, “image popularity prediction” and “lifelog summarisation”. In the first scenario, we attempt to produce a rank of relevant and diverse images for various news events by (i) removing irrelevant images such memes and visual duplicates (ii) before semantically clustering images based on the tweets in which they were originally posted. Using this approach, we were able to achieve over 50% precision for images in the top 5 ranks. In the second retrieval scenario, we show that by combining contextual and content-based features from images, we are able to predict if it will become “popular” (or not) with 74% accuracy, using an SVM classifier. Finally, in chapter 9 we employ blur detection and perceptual-hash clustering in order to remove noisy images from lifelogs, before combining visual and geo-temporal signals in order to capture a user’s “key moments” within their day. We believe that the results of this thesis show an important step towards building effective image retrieval models when there lacks sufficient textual content (i.e. a cold start).

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Transcription activator-like effectors (TALEs) are virulence factors, produced by the bacterial plant-pathogen Xanthomonas, that function as gene activators inside plant cells. Although the contribution of individual TALEs to infectivity has been shown, the specific roles of most TALEs, and the overall TALE diversity in Xanthomonas spp. is not known. TALEs possess a highly repetitive DNA-binding domain, which is notoriously difficult to sequence. Here, we describe an improved method for characterizing TALE genes by the use of PacBio sequencing. We present 'AnnoTALE', a suite of applications for the analysis and annotation of TALE genes from Xanthomonas genomes, and for grouping similar TALEs into classes. Based on these classes, we propose a unified nomenclature for Xanthomonas TALEs that reveals similarities pointing to related functionalities. This new classification enables us to compare related TALEs and to identify base substitutions responsible for the evolution of TALE specificities. © 2016, Nature Publishing Group. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Annotation Pro - a description of techniques, methods implemented in the tool, as well as the list of all built in functionalities and features of the user interface, and usage tips.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Automatic video segmentation plays a vital role in sports videos annotation. This paper presents a fully automatic and computationally efficient algorithm for analysis of sports videos. Various methods of automatic shot boundary detection have been proposed to perform automatic video segmentation. These investigations mainly concentrate on detecting fades and dissolves for fast processing of the entire video scene without providing any additional feedback on object relativity within the shots. The goal of the proposed method is to identify regions that perform certain activities in a scene. The model uses some low-level feature video processing algorithms to extract the shot boundaries from a video scene and to identify dominant colours within these boundaries. An object classification method is used for clustering the seed distributions of the dominant colours to homogeneous regions. Using a simple tracking method a classification of these regions to active or static is performed. The efficiency of the proposed framework is demonstrated over a standard video benchmark with numerous types of sport events and the experimental results show that our algorithm can be used with high accuracy for automatic annotation of active regions for sport videos.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a semi-parametric Algorithm for parsing football video structures. The approach works on a two interleaved based process that closely collaborate towards a common goal. The core part of the proposed method focus perform a fast automatic football video annotation by looking at the enhance entropy variance within a series of shot frames. The entropy is extracted on the Hue parameter from the HSV color system, not as a global feature but in spatial domain to identify regions within a shot that will characterize a certain activity within the shot period. The second part of the algorithm works towards the identification of dominant color regions that could represent players and playfield for further activity recognition. Experimental Results shows that the proposed football video segmentation algorithm performs with high accuracy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Thematization is recognized as a fundamental phenomenon in the construction of messages and texts by di erent linguistic schools. This location within a text privileges the elements that guide the reader in the orientation and interpretation of discourse at di erent levels. Thematizing a linguistic unit by locating it in the rst-initial position of a clause, paragraph, or text, confers upon it a special status: a signal of the organizational strategy which characterizes di erent text types playing a role as a variable in the distinction of registers, text types and genres. However, in spite of the importance of the study of thematization for message and textual structuring, to date there are no linguistic studies that have undertook the task of validating its aspects in a comparative manner, either for linguistic or computational purposes. This study, therefore, lls a research gap by implementing a methodology based on contrastive corpus annotation, which allows to empirically validate aspects of the phenomenon of Thematization in English and Spanish, it also seeks to develop a bilingual English-Spanish comparable corpus of newspaper texts automatically annotated with thematic features at clausal and discourse levels. The empirically validated categories (Thematic Field and its elements: Textual Theme, Interpersonal Theme, PreHead and Head) are used to annotate a larger corpus of three newspaper genres news reports, editorials and letters to the editor in terms of thematic choices. This characterization, reveals interesting results, such as the use of genre-speci c strategies in thematic position. In addition, the thesis investigates the possibility to automate the annotation of thematic features in the bilingual corpus through the development of a set of JAVA rules implemented in GATE. It also shows the e cacy of this method in comparison with the manual annotation results...

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Biology is now a “Big Data Science” thanks to technological advancements allowing the characterization of the whole macromolecular content of a cell or a collection of cells. This opens interesting perspectives, but only a small portion of this data may be experimentally characterized. From this derives the demand of accurate and efficient computational tools for automatic annotation of biological molecules. This is even more true when dealing with membrane proteins, on which my research project is focused leading to the development of two machine learning-based methods: BetAware-Deep and SVMyr. BetAware-Deep is a tool for the detection and topology prediction of transmembrane beta-barrel proteins found in Gram-negative bacteria. These proteins are involved in many biological processes and primary candidates as drug targets. BetAware-Deep exploits the combination of a deep learning framework (bidirectional long short-term memory) and a probabilistic graphical model (grammatical-restrained hidden conditional random field). Moreover, it introduced a modified formulation of the hydrophobic moment, designed to include the evolutionary information. BetAware-Deep outperformed all the available methods in topology prediction and reported high scores in the detection task. Glycine myristoylation in Eukaryotes is the binding of a myristic acid on an N-terminal glycine. SVMyr is a fast method based on support vector machines designed to predict this modification in dataset of proteomic scale. It uses as input octapeptides and exploits computational scores derived from experimental examples and mean physicochemical features. SVMyr outperformed all the available methods for co-translational myristoylation prediction. In addition, it allows (as a unique feature) the prediction of post-translational myristoylation. Both the tools here described are designed having in mind best practices for the development of machine learning-based tools outlined by the bioinformatics community. Moreover, they are made available via user-friendly web servers. All this make them valuable tools for filling the gap between sequential and annotated data.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This dissertation presents a systematic and analytic overview of most of the information related to stones, minerals, and stone masonry which is found in the corpus of Plutarch of Chaeronea, combined with most of the information on metals and metalworking which is connected to the former. This survey is intended as a first step in the reconstruction of the full landscape of ‘chemical’ ideas occurring in Plutarch’s writings; accordingly, the exposition of the relevant passages, the assessment of their possible interpretations, the discussion on their implications, and their contextualization in the ancient traditions have been conducted with a special interest in the ‘mineralogical’ and ‘metallurgic’ themes developed in the frame of natural philosophy and meteorology. Although in this perspective physical etiology could have come to acquire central prominence, non-etiological information on Plutarch’s ideas on the nature and behaviour of stones and metals has been treated as equally relevant to reach a fuller understanding of how Plutarch conceptualized and visualized them in general, in- and outside the frame of philosophical explanation. Such extensive outline of Plutarch’s ideas on stones and metals is a prerequisite for an accurate inquiry into his use of the two in analogies, metaphors, and symbols: to predispose this kind of research was another aim of the present survey, and this aim has contributed to shape it; moreover, a special attention has been paid to the analysis of analogical and figurative speaking due to the nature itself of a large part of Plutarch’s references to stones and metals, which are either metaphorical, presented in close association with metaphors, or framed in analogies. Much of the information used for the present overview has been extracted —always with supporting argumentation— from the implications of such metaphors and analogies.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background: WGS is increasingly used as a first-line diagnostic test for patients with rare genetic diseases such as neurodevelopmental disorders (NDD). Clinical applications require a robust infrastructure to support processing, storage and analysis of WGS data. The identification and interpretation of SVs from WGS data also needs to be improved. Finally, there is a need for a prioritization system that enables downstream clinical analysis and facilitates data interpretation. Here, we present the results of a clinical application of WGS in a cohort of patients with NDD. Methods: We developed highly portable workflows for processing WGS data, including alignment, quality control, and variant calling of SNVs and SVs. A benchmark analysis of state-of-the-art SV detection tools was performed to select the most accurate combination for SV calling. A gene-based prioritization system was also implemented to support variant interpretation. Results: Using a benchmark analysis, we selected the most accurate combination of tools to improve SV detection from WGS data and build a dedicated pipeline. Our workflows were used to process WGS data from 77 NDD patient-parent families. The prioritization system supported downstream analysis and enabled molecular diagnosis in 32% of patients, 25% of which were SVs and suggested a potential diagnosis in 20% of patients, requiring further investigation to achieve diagnostic certainty. Conclusion: Our data suggest that the integration of SNVs and SVs is a main factor that increases diagnostic yield by WGS and show that the adoption of a dedicated pipeline improves the process of variant detection and interpretation.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Chemical cross-linking has emerged as a powerful approach for the structural characterization of proteins and protein complexes. However, the correct identification of covalently linked (cross-linked or XL) peptides analyzed by tandem mass spectrometry is still an open challenge. Here we present SIM-XL, a software tool that can analyze data generated through commonly used cross-linkers (e.g., BS3/DSS). Our software introduces a new paradigm for search-space reduction, which ultimately accounts for its increase in speed and sensitivity. Moreover, our search engine is the first to capitalize on reporter ions for selecting tandem mass spectra derived from cross-linked peptides. It also makes available a 2D interaction map and a spectrum-annotation tool unmatched by any of its kind. We show SIM-XL to be more sensitive and faster than a competing tool when analyzing a data set obtained from the human HSP90. The software is freely available for academic use at http://patternlabforproteomics.org/sim-xl. A video demonstrating the tool is available at http://patternlabforproteomics.org/sim-xl/video. SIM-XL is the first tool to support XL data in the mzIdentML format; all data are thus available from the ProteomeXchange consortium (identifier PXD001677).

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Background: Cutaneous mycoses are common human infections among healthy and immunocompromised hosts, and the anthropophilic fungus Trichophyton rubrum is the most prevalent microorganism isolated from such clinical cases worldwide. The aim of this study was to determine the transcriptional profile of T. rubrum exposed to various stimuli in order to obtain insights into the responses of this pathogen to different environmental challenges. Therefore, we generated an expressed sequence tag (EST) collection by constructing one cDNA library and nine suppression subtractive hybridization libraries. Results: The 1388 unigenes identified in this study were functionally classified based on the Munich Information Center for Protein Sequences (MIPS) categories. The identified proteins were involved in transcriptional regulation, cellular defense and stress, protein degradation, signaling, transport, and secretion, among other functions. Analysis of these unigenes revealed 575 T. rubrum sequences that had not been previously deposited in public databases. Conclusion: In this study, we identified novel T. rubrum genes that will be useful for ORF prediction in genome sequencing and facilitating functional genome analysis. Annotation of these expressed genes revealed metabolic adaptations of T. rubrum to carbon sources, ambient pH shifts, and various antifungal drugs used in medical practice. Furthermore, challenging T. rubrum with cytotoxic drugs and ambient pH shifts extended our understanding of the molecular events possibly involved in the infectious process and resistance to antifungal drugs.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Cuticle renewal is a complex biological process that depends on the cross talk between hormone levels and gene expression. This study characterized the expression of two genes encoding cuticle proteins sharing the four conserved amino acid blocks of the Tweedle family, AmelTwdl1 and AmelTwdl2, and a gene encoding a cuticle peroxidase containing the Animal haem peroxidase domain, Ampxd, in the honey bee. Gene sequencing and annotation validated the formerly predicted tweedle genes, and revealed a novel gene, Ampxd, in the honey bee genome. Expression of these genes was studied in the context of the ecdysteroid-coordinated pupal-to-adult molt, and in different tissues. Higher transcript levels were detected in the integument after the ecdysteroid peak that induces apolysis, coinciding with the synthesis and deposition of the adult exoskeleton and its early differentiation. The effect of this hormone was confirmed in vivo by tying a ligature between the thorax and abdomen of early pupae to prevent the abdominal integument from coming in contact with ecdysteroids released from the prothoracic gland. This procedure impaired the natural increase in transcript levels in the abdominal integument. Both tweedle genes were expressed at higher levels in the empty gut than in the thoracic integument and trachea of pharate adults. In contrast, Ampxd transcripts were found in higher levels in the thoracic integument and trachea than in the gut. Together, the data strongly suggest that these three genes play roles in ecdysteroid-dependent exoskeleton construction and differentiation and also point to a possible role for the two tweedle genes in the formation of the cuticle (peritrophic membrane) that internally lines the gut.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Background: Melanoma progression occurs through three major stages: radial growth phase (RGP), confined to the epidermis; vertical growth phase (VGP), when the tumor has invaded into the dermis; and metastasis. In this work, we used suppression subtractive hybridization (SSH) to investigate the molecular signature of melanoma progression, by comparing a group of metastatic cell lines with an RGP-like cell line showing characteristics of early neoplastic lesions including expression of the metastasis suppressor KISS1, lack of alpha v beta 3-integrin and low levels of RHOC. Methods: Two subtracted cDNA collections were obtained, one (RGP library) by subtracting the RGP cell line (WM1552C) cDNA from a cDNA pool from four metastatic cell lines (WM9, WM852, 1205Lu and WM1617), and the other (Met library) by the reverse subtraction. Clones were sequenced and annotated, and expression validation was done by Northern blot and RT-PCR. Gene Ontology annotation and searches in large-scale melanoma expression studies were done for the genes identified. Results: We identified 367 clones from the RGP library and 386 from the Met library, of which 351 and 368, respectively, match human mRNA sequences, representing 288 and 217 annotated genes. We confirmed the differential expression of all genes selected for validation. In the Met library, we found an enrichment of genes in the growth factors/receptor, adhesion and motility categories whereas in the RGP library, enriched categories were nucleotide biosynthesis, DNA packing/repair, and macromolecular/vesicular trafficking. Interestingly, 19% of the genes from the RGP library map to chromosome 1 against 4% of the ones from Met library. Conclusion: This study identifies two populations of genes differentially expressed between melanoma cell lines from two tumor stages and suggests that these sets of genes represent profiles of less aggressive versus metastatic melanomas. A search for expression profiles of melanoma in available expression study databases allowed us to point to a great potential of involvement in tumor progression for several of the genes identified here. A few sequences obtained here may also contribute to extend annotated mRNAs or to the identification of novel transcripts.