584 results for Annotations


Relevance:

10.00% 10.00%

Publisher:

Abstract:

The ongoing growth of the World Wide Web, catalyzed by the increasing possibility of ubiquitous access via a variety of devices, continues to strengthen its role as our prevalent information and communication medium. However, although tools like search engines facilitate retrieval, the task of finally making sense of Web content is still often left to human interpretation. The vision of supporting both humans and machines in such knowledge-based activities led to the development of different systems which allow Web resources to be structured by metadata annotations. Interestingly, two major approaches which gained a considerable amount of attention address the problem from nearly opposite directions: On the one hand, the idea of the Semantic Web suggests formalizing the knowledge within a particular domain by means of the "top-down" approach of defining ontologies. On the other hand, Social Annotation Systems, as part of the so-called Web 2.0 movement, implement a "bottom-up" style of categorization using arbitrary keywords. Experience as well as research into the characteristics of both systems has shown that their strengths and weaknesses seem to be inverse: While Social Annotation suffers from problems such as ambiguity or lack of precision, ontologies were designed specifically to eliminate them. Conversely, ontologies suffer from a knowledge acquisition bottleneck, which is successfully overcome by the large user populations of Social Annotation Systems. Rather than being regarded as competing paradigms, the two offer obvious potential synergies, which motivated approaches to "bridge the gap" between them. These were fostered by the evidence of emergent semantics, i.e., the self-organized evolution of implicit conceptual structures, within Social Annotation data.
While several techniques to exploit the emergent patterns were proposed, a systematic analysis, especially regarding paradigms from the field of ontology learning, is still largely missing. This also includes a deeper understanding of the circumstances which affect the evolution processes. This work aims to address this gap by providing an in-depth study of methods and influencing factors to capture emergent semantics from Social Annotation Systems. We focus on the acquisition of lexical semantics from the underlying networks of keywords, users and resources. Structured along different ontology learning tasks, we use a methodology of semantic grounding to characterize and evaluate the semantic relations captured by different methods. In all cases, our studies are based on datasets from several Social Annotation Systems. Specifically, we first analyze semantic relatedness among keywords, and identify measures which detect different notions of relatedness. These constitute the input of concept learning algorithms, which then focus on the discovery of synonymous and ambiguous keywords. Here, we assess the usefulness of various clustering techniques. As a prerequisite to inducing hierarchical relationships, our next step is to study measures which quantify the level of generality of a particular keyword. We find that comparatively simple measures can approximate the generality information encoded in reference taxonomies. These insights are used to inform the final task, namely the creation of concept hierarchies. For this purpose, generality-based algorithms exhibit advantages compared to clustering approaches. To complement the identification of suitable methods to capture semantic structures, we next analyze several factors which influence their emergence. Empirical evidence is provided that the amount of available data plays a crucial role in determining keyword meanings.
From a different perspective, we examine pragmatic aspects by considering different annotation patterns among users. Based on a broad distinction between "categorizers" and "describers", we find that the latter produce more accurate results. This suggests a causal link between pragmatic and semantic aspects of keyword annotation. As a special kind of usage pattern, we then examine system abuse and spam. While observing a mixed picture, we suggest deciding case by case rather than disregarding spammers as a matter of principle. Finally, we discuss a set of applications which operationalize the results of our studies for enhancing both Social Annotation and semantic systems. These comprise, on the one hand, tools which foster the emergence of semantics and, on the other hand, applications which exploit the socially induced relations to improve, e.g., searching, browsing, or user profiling facilities. In summary, the contributions of this work highlight viable methods and crucial aspects for designing enhanced knowledge-based services of a Social Semantic Web.
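
The abstract above does not name the relatedness measures it evaluates. As one illustration of the general idea, the following minimal sketch assumes cosine similarity over tag co-occurrence vectors, a common choice for folksonomy data. The `posts` data and both function names are hypothetical and not taken from the work itself.

```python
from collections import Counter, defaultdict
from itertools import permutations
from math import sqrt


def cooccurrence_vectors(posts):
    """Map each tag to a Counter of the tags it shares a post with."""
    vectors = defaultdict(Counter)
    for tags in posts:
        # Every ordered pair of distinct tags in one post counts once.
        for a, b in permutations(set(tags), 2):
            vectors[a][b] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical annotation data: each post is the set of tags one user
# assigned to one resource.
posts = [
    {"python", "programming", "tutorial"},
    {"java", "programming", "tutorial"},
    {"python", "programming"},
    {"recipe", "cooking"},
]

vectors = cooccurrence_vectors(posts)
sim_py_java = cosine(vectors["python"], vectors["java"])
sim_py_recipe = cosine(vectors["python"], vectors["recipe"])
print(f"python~java: {sim_py_java:.3f}, python~recipe: {sim_py_recipe:.3f}")
```

In this toy data, "python" and "java" never share a post, yet they score as related because they appear in similar contexts, which is the kind of distributional relatedness such measures are meant to pick up.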

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We present a type-based approach to statically derive symbolic closed-form formulae that characterize the bounds of heap memory usage of programs written in object-oriented languages. Given a program with size and alias annotations, our inference system computes the amount of memory required by the methods to execute successfully as well as the amount of memory released when methods return. The analysis results are useful for networked devices with limited computational resources as well as for embedded software.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Wednesday 9th April 2014. Speaker(s): Guus Schreiber. Time: 09/04/2014 11:00-11:50. Location: B32/3077. File size: 546Mb.

Abstract: In this talk I will discuss linked data for museums, archives and libraries. This area is known for its knowledge-rich and heterogeneous data landscape. The objects in this field range from old manuscripts to recent TV programs. Challenges in this field include common metadata schemas, inter-linking of the omnipresent vocabularies, cross-collection search strategies, user-generated annotations and object-centric versus event-centric views of data. This work can be seen as part of the rapidly evolving field of digital humanities.

Speaker Biography: Guus Schreiber. Guus is a professor of Intelligent Information Systems at the Department of Computer Science at VU University Amsterdam. Guus’ research interests are mainly in knowledge and ontology engineering, with a special interest in applications in the field of cultural heritage. He was one of the key developers of the CommonKADS methodology. Guus acts as chair of W3C groups for Semantic Web standards such as RDF, OWL, SKOS and RDFa. His research group is involved in a wide range of national and international research projects. He is now project coordinator of the EU Integrated Project NoTube, concerned with the integration of Web and TV data with the help of semantics, and was previously Scientific Director of the EU Network of Excellence “Knowledge Web”.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This talk will present an overview of the ongoing ERCIM project SMARTDOCS (SeMAntically-cReaTed DOCuments) which aims at automatically generating webpages from RDF data. It will particularly focus on the current issues and the investigated solutions in the different modules of the project, which are related to document planning, natural language generation and multimedia perspectives. The second part of the talk will be dedicated to the KODA annotation system, which is a knowledge-base-agnostic annotator designed to provide the RDF annotations required in the document generation process.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Bacteria of the genera Raoultella and Klebsiella are opportunistic pathogens for which there is no uniform international system of taxonomic classification. This study proposes a molecular phylogeny of the genera Klebsiella and Raoultella, based on the 16S ribosomal gene (16S rDNA) and the gene encoding the RNA polymerase subunit (rpoB), in order to establish evolutionary relationships between these genera. The results show a grouping consistent with the taxonomy and the characteristic biochemical properties reported in GenBank. A bifurcation was established in the trees, which confirms the separation of the genera Klebsiella and Raoultella. In addition, the polyphyletic character of K. aerogenes was confirmed by the 16S rDNA gene, as was the grouping of R. terrigena and K. oxytoca according to the rpoB gene. Comparison of the resulting trees made it possible to determine evolutionary relationships among the species on the basis of the genes evaluated, which reflects apparent changes at the taxonomic level and corroborates the importance of multilocus analysis. This type of study makes it possible to monitor the stability of microbial genotypes over temporal and spatial scales, to improve the precision of taxonomic annotations (better description of taxa or genetic subdivisions), and to evaluate genetic diversity and adaptability in terms of virulence or drug resistance.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This paper presents a case study that explores the advantages that can be derived from the use of a design support system during the design of wastewater treatment plants (WWTP). With this objective in mind, a simplified but plausible WWTP design case study has been generated with KBDS, a computer-based support system that maintains a historical record of the design process. The study shows how, by employing such a historical record, it is possible to: (1) rank different design proposals responding to a design problem; (2) study the influence of changing the weight of the arguments used in the selection of the most adequate proposal; (3) take advantage of keywords to assist the designer in searching for specific items within the historical records; (4) automatically evaluate the compliance of alternative design proposals with the design objectives; (5) verify the validity of previous decisions after the modification of the current constraints or specifications; (6) re-use the design records when upgrading an existing WWTP or when designing similar facilities; (7) generate documentation of the decision-making process; and (8) associate a variety of documents as annotations to any component in the design history. The paper also shows one possible future role of design support systems as they outgrow their current reactive role as repositories of historical information and start to proactively support the generation of new knowledge during the design process.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The philological work of Antoni de Bastero i Lledó (1675-1737) is studied as a whole, in order to specify this scholar's activity in the fields of linguistics, philology and literary criticism, and to assess it in light of current knowledge about the practice of these disciplines during the first half of the eighteenth century. The thesis includes a biographical study, absolutely necessary to establish many of the circumstances of Canon Bastero's life that were obscure to us and that are decisive in explaining his own interest in philology, his relations with certain academic circles, the approximate dating of the various projects he began, and the correct interpretation of his activity. It also includes an exhaustive catalogue of all the surviving manuscripts of Antoni de Bastero that bear some relation to his philological work. In total, 69 manuscript volumes are taken into account, currently scattered across various archives and libraries in Barcelona and Girona, some of which were previously unknown. Of these 69 volumes, 48 contain works by Bastero proper or other publishable materials, and the rest are working materials.
Consequently, the canon's philological work can be specified as: the production of an Italian grammar and a French grammar, in Catalan, which he left unfinished; the making of La Crusca provenzale, a great etymological dictionary and dictionary of authorities that gathers a large number of hypothetical Provençalisms in Italian (only the first volume of this work was published, in Rome in 1724, but I have located practically all of its content); the preparation of an extensive anthology of troubadour poems, copied with great rigour from several codices in the Vatican Library; and the planning of a Història de llengua catalana (History of the Catalan Language), which was to be a great compilation of the merits and excellences of this language, which the author identifies with Provençal, and of its literature, and which could be developed only partially. Indeed, the central part of the thesis is devoted to the specific study and critical edition of the parts of this work that were written, which embody the particular linguistic and literary perception that Bastero had been shaping over his years of study. It is a very complex edition, because the work has come down to us only in a draft, which shows multiple corrections and emendations and reveals different stages of redaction; the manuscripts also include numerous papers with annotations or fragments that either do not belong to the body of the work or have had to be relocated to their proper place. The result is, nevertheless, a reasonably coherent text comprising almost the whole of Book One (on the origin, birth and various names of the language, and on the name of Catalonia) and one chapter of Book Three (on the early extension of Catalan throughout Spain).
The most significant aspect of this work is that it builds an original theory of the formation of the various Romance languages with Catalan as its central axis: it proposes identifying Catalan-Provençal with the lingua romana of early medieval documents, an operation that anticipates by almost a hundred years François Raynouard, who advocated the same thing, referring only to Provençal, with broad approval from the scientific community of his time. Also noteworthy are an exceptional historical and documentary rigour and a notable sensitivity to spoken language, which is the subject of some very interesting annotations. The thesis closes with a series of documentary appendices transcribing various documents related to the aspects discussed above.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In this work, a new method for clustering and building a topographic representation of a bacterial taxonomy is presented. The method is based on the analysis of stable parts of the genome, the so-called “housekeeping genes”. The proposed method generates topographic maps of the bacterial taxonomy, where relations among different type strains can be visually inspected and verified. Two well-known DNA alignment algorithms are applied to the genomic sequences. Topographic maps are optimized to represent the similarity among the sequences according to their evolutionary distances. The experimental analysis is carried out on 147 type strains of the Gammaproteobacteria class by means of the 16S rRNA housekeeping gene. Complete sequences of the gene have been retrieved from the NCBI public database. In the experimental tests, the maps show clusters of homologous type strains and present some singular cases potentially due to incorrect classification or erroneous annotations in the database.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Motivation: There is a frequent need to apply a large range of local or remote prediction and annotation tools to one or more sequences. We have created a tool able to dispatch one or more sequences to assorted services by defining a consistent XML format for data and annotations. Results: By analyzing annotation tools, we have determined that annotations can be described using one or more of six forms of data: numeric or textual annotation of residues, domains (residue ranges) or whole sequences. With this in mind, XML DTDs have been designed to store the input and output of any server. Plug-in wrappers for a number of services have been written, which are called from a master script. The resulting APATML is then formatted for display in HTML. Alternatively, further tools may be written to perform post-analysis.
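
The six forms mentioned above are the product of two value types (numeric, textual) and three scopes (residue, domain, whole sequence). The sketch below models that taxonomy as a small Python data structure. It is not the APATML DTD, which the abstract does not reproduce; the class names, field names and tool names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Union


class Scope(Enum):
    RESIDUE = "residue"      # a single position
    DOMAIN = "domain"        # a residue range
    SEQUENCE = "sequence"    # the whole sequence


@dataclass(frozen=True)
class Annotation:
    """One annotation: a numeric or textual value attached to a scope."""
    tool: str                      # hypothetical name of the producing service
    scope: Scope
    value: Union[float, str]
    start: Optional[int] = None    # 1-based, inclusive (assumed convention)
    end: Optional[int] = None

    def __post_init__(self) -> None:
        if self.scope is Scope.SEQUENCE:
            if self.start is not None or self.end is not None:
                raise ValueError("whole-sequence annotations carry no positions")
        elif self.start is None or self.end is None or self.start > self.end:
            raise ValueError("residue and domain annotations need start <= end")
        elif self.scope is Scope.RESIDUE and self.start != self.end:
            raise ValueError("a residue annotation covers exactly one position")

    @property
    def form(self) -> str:
        """Which of the six forms this annotation falls into."""
        kind = "textual" if isinstance(self.value, str) else "numeric"
        return f"{kind}/{self.scope.value}"


# One example per form; tools and values are made up for illustration.
examples = [
    Annotation("helix_predictor", Scope.RESIDUE, 0.87, 12, 12),
    Annotation("helix_predictor", Scope.RESIDUE, "H", 12, 12),
    Annotation("domain_finder", Scope.DOMAIN, 42.5, 30, 118),
    Annotation("domain_finder", Scope.DOMAIN, "kinase-like", 30, 118),
    Annotation("pi_calculator", Scope.SEQUENCE, 6.4),
    Annotation("localiser", Scope.SEQUENCE, "nuclear"),
]
forms = sorted(a.form for a in examples)
print(forms)
```

Keeping the scope explicit, rather than inferring it from the positions, lets a single-residue domain stay distinguishable from a residue-level annotation.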

Relevance:

10.00% 10.00%

Publisher:

Abstract:

BACKGROUND: In order to maintain the most comprehensive structural annotation databases we must carry out regular updates for each proteome using the latest profile-profile fold recognition methods. The ability to carry out these updates on demand is necessary to keep pace with the regular updates of sequence and structure databases. Providing the highest quality structural models requires the most intensive profile-profile fold recognition methods running with the very latest available sequence databases and fold libraries. However, running these methods on such a regular basis for every sequenced proteome requires large amounts of processing power. In this paper we describe and benchmark the JYDE (Job Yield Distribution Environment) system, which is a meta-scheduler designed to work above cluster schedulers, such as Sun Grid Engine (SGE) or Condor. We demonstrate the ability of JYDE to distribute the load of genomic-scale fold recognition across multiple independent Grid domains. We use the most recent profile-profile version of our mGenTHREADER software in order to annotate the latest version of the Human proteome against the latest sequence and structure databases in as short a time as possible. RESULTS: We show that our JYDE system is able to scale to large numbers of intensive fold recognition jobs running across several independent computer clusters. Using our JYDE system we have been able to annotate 99.9% of the protein sequences within the Human proteome in less than 24 hours, by harnessing over 500 CPUs from 3 independent Grid domains. CONCLUSION: This study clearly demonstrates the feasibility of carrying out on-demand, high-quality structural annotations for the proteomes of major eukaryotic organisms. Specifically, we have shown that it is now possible to provide complete regular updates of profile-profile based fold recognition models for entire eukaryotic proteomes, through the use of Grid middleware such as JYDE.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

An automatic method for recognizing natively disordered regions from amino acid sequence is described and benchmarked against predictors that were assessed at the latest critical assessment of techniques for protein structure prediction (CASP) experiment. The method attains a Wilcoxon score of 90.0, which represents a statistically significant improvement on the methods evaluated on the same targets at CASP. The classifier, DISOPRED2, was used to estimate the frequency of native disorder in several representative genomes from the three kingdoms of life. Putative, long (>30 residue) disordered segments are found to occur in 2.0% of archaean, 4.2% of eubacterial and 33.0% of eukaryotic proteins. The function of proteins with long predicted regions of disorder was investigated using the gene ontology annotations supplied with the Saccharomyces genome database. The analysis of the yeast proteome suggests that proteins containing disorder are often located in the cell nucleus and are involved in the regulation of transcription and cell signalling. The results also indicate that native disorder is associated with the molecular functions of kinase activity and nucleic acid binding.
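
The genome-wide figures above count proteins containing at least one long (>30 residue) predicted disordered segment. A minimal sketch of that counting step follows. It assumes per-residue predictions are already available as strings of "D" (disordered) and "O" (ordered), which is an illustrative encoding and not DISOPRED2's actual output format. The function names and the toy proteome are hypothetical.

```python
from itertools import groupby


def longest_disordered_run(states: str) -> int:
    """Length of the longest consecutive run of 'D' in a per-residue string."""
    return max(
        (sum(1 for _ in run) for flag, run in groupby(states) if flag == "D"),
        default=0,
    )


def fraction_with_long_disorder(proteome: dict, threshold: int = 30) -> float:
    """Fraction of proteins with a disordered segment longer than `threshold`."""
    if not proteome:
        return 0.0
    hits = sum(
        1 for states in proteome.values()
        if longest_disordered_run(states) > threshold
    )
    return hits / len(proteome)


# Toy data: protein id -> per-residue prediction string.
proteome = {
    "protA": "O" * 50 + "D" * 35 + "O" * 20,   # one 35-residue segment
    "protB": "O" * 100,                        # fully ordered
    "protC": "D" * 30 + "O" * 10 + "D" * 30,   # two segments of exactly 30
    "protD": "D" * 31,                         # just over the threshold
}

fraction = fraction_with_long_disorder(proteome)
print(f"{fraction:.1%} of proteins contain a long disordered segment")
```

The comparison is strict, so a segment of exactly 30 residues does not count, matching the ">30 residue" wording.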

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The Genomic Threading Database currently contains structural annotations for the genomes of over 100 recently sequenced organisms. Annotations are carried out by using our modified GenTHREADER software and through implementing grid technology.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Background: The analysis of the Auditory Brainstem Response (ABR) is of fundamental importance to the investigation of the behaviour of the auditory system, though its interpretation is subjective because of the manual process employed in its study and the clinical experience required for its analysis. When analysing the ABR, clinicians are often interested in identifying the ABR signal components referred to as Jewett waves. In particular, the detection and study of the time at which these waves occur (i.e., the wave latency) is a practical tool for the diagnosis of disorders affecting the auditory system. Significant differences in inter-examiner results may lead to completely distinct clinical interpretations of the state of the auditory system. In this context, the aim of this research was to evaluate the inter-examiner agreement and variability in the manual classification of ABR. Methods: A total of 160 ABR data samples were collected, at four different stimulus intensities (80 dBHL, 60 dBHL, 40 dBHL and 20 dBHL), from 10 normal-hearing subjects (5 men and 5 women, aged 20 to 52 years). Four examiners with expertise in the manual classification of ABR components participated in the study. The Bland-Altman statistical method was employed for the assessment of inter-examiner agreement and variability. The mean, standard deviation and error of the bias, which is the difference between examiners’ annotations, were estimated for each pair of examiners. Scatter plots and histograms were employed for data visualization and analysis. Results: In most comparisons the differences between examiners’ annotations were below 0.1 ms, which is clinically acceptable. In four cases, a large error and standard deviation (>0.1 ms) were found, indicating the presence of outliers and thus discrepancies between examiners.
Conclusions: Our results quantify the inter-examiner agreement and variability of the manual analysis of ABR data, and they also allow for the determination of different patterns of manual ABR analysis.
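
The Bland-Altman quantities mentioned in the Methods can be sketched as follows for a single pair of examiners. This is a minimal illustration using the standard definitions (bias as the mean of paired differences, sample standard deviation, standard error, and 95% limits of agreement at bias ± 1.96 SD). The latency values are invented for the example and are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def bland_altman(a, b):
    """Bland-Altman summary for paired measurements from two examiners."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)                 # mean difference between examiners
    sd = stdev(diffs)                  # sample standard deviation of differences
    return {
        "bias": bias,
        "sd": sd,
        "se": sd / sqrt(len(diffs)),   # standard error of the bias
        "limits": (bias - 1.96 * sd, bias + 1.96 * sd),
    }


# Hypothetical wave latencies in ms, marked by two examiners on the same traces.
examiner_1 = [5.60, 5.72, 5.90, 6.35, 5.65]
examiner_2 = [5.58, 5.75, 5.86, 6.30, 5.66]

result = bland_altman(examiner_1, examiner_2)
print(f"bias = {result['bias']:.3f} ms, sd = {result['sd']:.3f} ms")
print(f"95% limits of agreement: {result['limits'][0]:.3f} to {result['limits'][1]:.3f} ms")
```

Repeating this for each pair of examiners gives the per-pair bias, standard deviation and error that the abstract compares against the 0.1 ms threshold.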

Relevance:

10.00% 10.00%

Publisher:

Abstract:

For users of climate services, the ability to quickly determine the datasets that best fit one's needs would be invaluable. The volume, variety and complexity of climate data make this judgment difficult. The ambition of CHARMe ("Characterization of metadata to enable high-quality climate services") is to give a wider interdisciplinary community access to a range of supporting information, such as journal articles, technical reports or feedback on previous applications of the data. The capture and discovery of this "commentary" information, often created by data users rather than data providers, and currently not linked to the data themselves, has not been significantly addressed previously. CHARMe applies the principles of Linked Data and open web standards to associate, record, search and publish user-derived annotations in a way that can be read both by users and automated systems. Tools have been developed within the CHARMe project that enable annotation capability for data delivery systems already in wide use for discovering climate data. In addition, the project has developed advanced tools for exploring data and commentary in innovative ways, including an interactive data explorer and comparator ("CHARMe Maps") and a tool for correlating climate time series with external "significant events" (e.g. instrument failures or large volcanic eruptions) that affect the data quality. Although the project focuses on climate science, the concepts are general and could be applied to other fields. All CHARMe system software is open-source, released under a liberal licence, permitting future projects to re-use the source code as they wish.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The slow-growing genus Bradyrhizobium is biologically important in soils, with different representatives found to perform a range of biochemical functions including photosynthesis, induction of root nodules, symbiotic nitrogen fixation and denitrification. Consequently, the role of the genus in soil ecology and biogeochemical transformations is of agricultural and environmental significance. Some isolates of Bradyrhizobium have been shown to be non-symbiotic and do not possess the ability to form nodules. Here we present the genomes and gene annotations of two such free-living Bradyrhizobium isolates, named G22 and BF49, from soils with differing long-term management regimes (grassland and bare fallow, respectively), in addition to carbon metabolism analysis. These Bradyrhizobium isolates are the first to be isolated and sequenced from European soil and are the first free-living Bradyrhizobium isolates, lacking both nodulation and nitrogen fixation genes, to have their genomes sequenced and assembled from cultured samples. The G22 and BF49 genomes are distinctly different with respect to size and number of genes; the grassland isolate also contains a plasmid. There are also a number of functional differences between these isolates and other published genomes, suggesting that this ubiquitous genus is extremely heterogeneous and has roles within the community not including symbiotic nitrogen fixation.