498 results for tagging
Abstract:
Enriching knowledge bases with multimedia information makes it possible to complement textual descriptions with visual and audio information. Such complementary information can help users to understand the meaning of assertions, and in general improve the user experience with the knowledge base. In this paper we address the problem of how to enrich ontology instances with candidate images retrieved from existing Web search engines. DBpedia has evolved into a major hub in the Linked Data cloud, interconnecting millions of entities organized under a consistent ontology. Our approach taps into the Wikipedia corpus to gather context information for DBpedia instances and takes advantage of image tagging information when this is available to calculate semantic relatedness between instances and candidate images. We performed experiments with a focus on the particularly challenging problem of highly ambiguous names. Both methods presented in this work outperformed the baseline. Our best method leveraged context words from Wikipedia, tags from Flickr and type information from DBpedia to achieve an average precision of 80%.
Abstract:
Plant cysteine-proteases (CysProt) represent a well-characterized type of proteolytic enzymes that fulfill tightly regulated physiological functions (senescence and seed germination among others) and defense roles. This article is focused on the group of papain-proteases C1A (family C1, clan CA) and their inhibitors, phytocystatins (PhyCys). In particular, the protease–inhibitor interaction and their mutual participation in specific pathways throughout the plant's life are reviewed. C1A CysProt and PhyCys have been molecularly characterized, and comparative sequence analyses have identified consensus functional motifs. A correlation can be established between the number of identified CysProt and PhyCys in angiosperms. Thus, evolutionary forces may have determined a control role of cystatins on both endogenous and pest-exogenous proteases in these species. Tagging the proteases and inhibitors with fluorescent proteins revealed common patterns of subcellular localization in the endoplasmic reticulum–Golgi network in transiently transformed onion epidermal cells. Further in vivo interactions were demonstrated by bimolecular fluorescence complementation, suggesting their participation in the same physiological processes.
Abstract:
OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web
1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS
Computational Linguistics is already a consolidated research area. It builds upon the results of two other major ones, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly, its best-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs.
These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.
Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.
However, linguistic annotation tools still have some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.
A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitation stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.
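To make the idea of combining same-level annotations concrete, the following minimal sketch merges the outputs of several taggers by majority vote. Majority voting is an assumption chosen for illustration; the text does not state which combination strategy is used, and the function name `combine_annotations` and the tag labels are hypothetical.

```python
from collections import Counter


def combine_annotations(tagger_outputs):
    """Combine per-token tags from several taggers by majority vote.

    tagger_outputs: list of tag sequences, one per tagger, all annotating
    the same tokens in the same order (an assumption of this sketch).
    Ties are resolved in favour of the tagger listed first.
    """
    if not tagger_outputs:
        return []
    length = len(tagger_outputs[0])
    if any(len(seq) != length for seq in tagger_outputs):
        raise ValueError("all taggers must annotate the same tokens")

    combined = []
    for votes in zip(*tagger_outputs):
        counts = Counter(votes)
        best = max(counts.values())
        # The first tag (in tagger order) that reaches the top count wins.
        combined.append(next(tag for tag in votes if counts[tag] == best))
    return combined


# Three hypothetical POS taggers annotating the same three tokens.
outputs = [
    ["DET", "NOUN", "VERB"],
    ["DET", "VERB", "VERB"],
    ["DET", "NOUN", "NOUN"],
]
print(combine_annotations(outputs))
```

A vote of this kind only works once the taggers share a tag vocabulary, which is exactly what limitation (3) makes difficult in practice.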
In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by another higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.
Therefore, it would be quite useful to find a way to
(i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools;
(ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. There again, ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should also help solve the aforementioned problems and limitations of linguistic annotation tools.
Thus, to summarise, the main aim of the present work was to somehow combine these separate approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a sort of hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.
2. GOALS OF THE PRESENT WORK
As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based
Abstract:
One of the advantages of social networks is the possibility to socialize and personalize the content created or shared by the users. In mobile social networks, where the devices have limited capabilities in terms of screen size and computing power, Multimedia Recommender Systems help to present the most relevant content to the users, depending on their tastes, relationships and profile. Previous recommender systems are not able to cope with the uncertainty of automated tagging and are knowledge-domain dependent. In addition, the instantiation of a recommender in this domain should cope with problems arising from the inherent nature of collaborative filtering (cold start, banana problem, large number of users to run, etc.). The solution presented in this paper addresses the abovementioned problems by proposing a hybrid image recommender system, which combines collaborative filtering (social techniques) with content-based techniques, leaving the user the liberty to give these processes a personal weight. It takes into account aesthetics and the formal characteristics of the images to overcome the problems of current techniques, improving the performance of existing systems to create a mobile social network recommender with a high degree of adaptation to any kind of user.
Abstract:
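A hybrid recommender with a user-chosen weight can be sketched as a linear blend of the two score sources. The linear blend, the function names `hybrid_scores` and `recommend`, and the parameter `social_weight` are assumptions made for illustration; the abstract does not specify how the two techniques are actually combined.

```python
def hybrid_scores(cf_scores, content_scores, social_weight=0.5):
    """Blend collaborative-filtering and content-based scores per item.

    cf_scores, content_scores: dicts mapping item id -> score in [0, 1].
    social_weight: weight the user gives to the collaborative (social)
    side; the content-based side receives 1 - social_weight.
    Items missing from one source are scored 0.0 by that source.
    """
    if not 0.0 <= social_weight <= 1.0:
        raise ValueError("social_weight must lie in [0, 1]")
    items = set(cf_scores) | set(content_scores)
    return {
        item: social_weight * cf_scores.get(item, 0.0)
        + (1.0 - social_weight) * content_scores.get(item, 0.0)
        for item in items
    }


def recommend(cf_scores, content_scores, social_weight=0.5, top_n=3):
    """Return the top_n item ids, best first (ties broken by item id)."""
    scores = hybrid_scores(cf_scores, content_scores, social_weight)
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical scores for three images.
cf = {"a": 0.9, "b": 0.2}
content = {"b": 0.8, "c": 0.6}
print(recommend(cf, content, social_weight=0.5))
```

Setting `social_weight` to 0 falls back to purely content-based ranking, which is one simple way to sidestep a cold start when no social signal exists yet.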
Folksonomies emerge as the result of the free tagging activity of a large number of users over a variety of resources. They can be considered as valuable sources from which it is possible to obtain emerging vocabularies that can be leveraged in knowledge extraction tasks. However, when it comes to understanding the meaning of tags in folksonomies, several problems mainly related to the appearance of synonymous and ambiguous tags arise, specifically in the context of multilinguality. The authors aim to turn folksonomies into knowledge structures where tag meanings are identified, and relations between them are asserted. For such purpose, they use DBpedia as a general knowledge base from which they leverage its multilingual capabilities.
Abstract:
Background. Over the last years, the number of available informatics resources in medicine has grown exponentially. While specific inventories of such resources have already begun to be developed for Bioinformatics (BI), comparable inventories are as yet not available for the Medical Informatics (MI) field, so that locating and accessing them currently remains a hard and time-consuming task. Description. We have created a repository of MI resources from the scientific literature, providing free access to its contents through a web-based service. Relevant information describing the resources is automatically extracted from manuscripts published in top-ranked MI journals. We used a pattern matching approach to detect the resources' names and their main features. Detected resources are classified according to three different criteria: functionality, resource type and domain. To facilitate these tasks, we have built three different taxonomies by following a novel approach based on folksonomies and social tagging. We adopted the terminology most frequently used by MI researchers in their publications to create the concepts and hierarchical relationships belonging to the taxonomies. The classification algorithm identifies the categories associated with resources and annotates them accordingly. The database is then populated with this data after manual curation and validation. Conclusions. We have created an online repository of MI resources to assist researchers in locating and accessing the most suitable resources to perform specific tasks. The database contained 282 resources at the time of writing. We are continuing to expand the number of available resources by taking into account further publications as well as suggestions from users and resource developers.
Abstract:
A real-time, large-scale part-to-part video matching algorithm, based on the cross-correlation of the intensity of motion curves, is proposed with a view to originality recognition, video database cleansing, copyright enforcement, video tagging or video result re-ranking. Moreover, it is suggested how the most representative hashes and distance functions (strada, discrete cosine transformation, Marr-Hildreth and radial) should be integrated in order for the matching algorithm to be invariant against blur, compression and rotation distortions: (R, σ) ∈ [1, 20] × [1, 8], from 512 × 512 to 32 × 32 pixels², and from 10° to 180°. The DCT hash is invariant against blur and compression up to 64 × 64 pixels². Nevertheless, although its performance against rotation is the best, with a success up to 70%, it should be combined with the Marr-Hildreth distance function. With the latter, the image selected by the DCT hash should be at a distance lower than 1.15 times the Marr-Hildreth minimum distance.
Abstract:
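The final acceptance rule can be read as a simple threshold check. The sketch below assumes that the Marr-Hildreth distances from the query to every candidate have already been computed; the function name `confirm_dct_match` and the dictionary layout are hypothetical, and only the factor 1.15 comes from the abstract.

```python
def confirm_dct_match(dct_choice, mh_distances, factor=1.15):
    """Check a DCT-hash selection against Marr-Hildreth distances.

    dct_choice: id of the candidate selected by the DCT hash.
    mh_distances: dict mapping candidate id -> Marr-Hildreth distance
        to the query (precomputed elsewhere; an assumption of this sketch).
    factor: tolerance relative to the minimum Marr-Hildreth distance.

    Returns True when the selected candidate lies at a distance strictly
    lower than factor times the minimum distance, following the wording
    "lower than" in the text.
    """
    if dct_choice not in mh_distances:
        raise KeyError(f"no Marr-Hildreth distance for {dct_choice!r}")
    minimum = min(mh_distances.values())
    return mh_distances[dct_choice] < factor * minimum


# Hypothetical distances for three candidate frames.
distances = {"x": 10.0, "y": 11.0, "z": 12.0}
print(confirm_dct_match("y", distances))  # 11.0 < 11.5
print(confirm_dct_match("z", distances))  # 12.0 >= 11.5
```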
This approach aims at aligning, unifying and expanding the set of sentiment lexicons which are available on the web in order to increase their robustness of coverage. A sentiment lexicon is a critical and essential resource for tagging subjective corpora on the web or elsewhere. In many situations, the multilingual property of the sentiment lexicon is important because the writer is using two languages alternately in the same text, message or post. Our USL approach computes the unified strength of polarity of each lexical entry based on the Pearson correlation coefficient, which measures how correlated lexical entries are with a value between 1 and -1, where 1 indicates that the lexical entries are perfectly correlated, 0 indicates no correlation, and -1 means they are perfectly inversely correlated, and the UnifiedMetrics procedure for CPU and GPU, respectively.
Abstract:
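For reference, the Pearson correlation coefficient itself can be computed as in the minimal sketch below. It shows only the standard coefficient on two score lists; how the USL approach applies it to lexical entries, and the UnifiedMetrics procedure, are not described in the abstract and are not reproduced here. The function name `pearson` is an illustrative choice.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Returns a value in [-1, 1]: 1 for perfect correlation, 0 for no
    linear correlation, -1 for perfect inverse correlation.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical polarity scores of the same entries in two lexicons.
lexicon_a = [1, 2, 3]
lexicon_b = [2, 4, 6]
print(pearson(lexicon_a, lexicon_b))
```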
We describe a domain ontology development approach that extracts domain terms from folksonomies and enriches them with data and vocabularies from the Linked Open Data cloud. As a result, we obtain lightweight domain ontologies that combine the emergent knowledge of social tagging systems with formal knowledge from ontologies. In order to illustrate the feasibility of our approach, we have produced an ontology in the financial domain from tags available in Delicious, using DBpedia, OpenCyc and UMBEL as additional knowledge sources.
Abstract:
We have investigated physical distances and directions of transposition of the maize transposable element Ac in Arabidopsis thaliana. We prepared a transferred DNA (T-DNA) construct that carried a non-autonomous derivative of Ac with a site for cleavage by endonuclease I-SceI (designated dAc-I-RS element). Another cleavage site was also introduced into the T-DNA region outside dAc-I-RS. Three transgenic Arabidopsis plants were generated, each of which had a single copy of the T-DNA at a different chromosomal location. These transgenic plants were crossed with Arabidopsis plants that carried the gene for Ac transposase, and progeny in which dAc-I-RS had been transposed were isolated. After digestion of the genomic DNA of these progeny with endonuclease I-SceI, the sizes of the DNA segments were determined by pulsed-field gel electrophoresis. We also performed linkage analysis for the transposed elements and sites of mutations near the elements. Our results showed that 50% of all transposition events had occurred within 1,700 kb on the same chromosome, with 35% within 200 kb, and that the elements transposed in both directions on the chromosome with roughly equal probability. The data thus indicate that the Ac–Ds system is most useful for tagging of genes that are present within 200 kb of the chromosomal site of Ac in Arabidopsis. In addition, determination of the precise localization of the transposed dAc-I-RS element should definitely assist in map-based cloning of genes around insertion sites.
Abstract:
Abscisic acid (ABA), an apocarotenoid synthesized from cleavage of carotenoids, regulates seed maturation and stress responses in plants. The viviparous seed mutants of maize identify genes involved in synthesis and perception of ABA. Two alleles of a new mutant, viviparous14 (vp14), were identified by transposon mutagenesis. Mutant embryos had normal sensitivity to ABA, and detached leaves of mutant seedlings showed markedly higher rates of water loss than those of wild type. The ABA content of developing mutant embryos was 70% lower than that of wild type, indicating a defect in ABA biosynthesis. vp14 embryos were not deficient in epoxy-carotenoids, and extracts of vp14 embryos efficiently converted the carotenoid cleavage product, xanthoxin, to ABA, suggesting a lesion in the cleavage reaction. vp14 was cloned by transposon tagging. The VP14 protein sequence is similar to bacterial lignostilbene dioxygenases (LSD). LSD catalyzes a double-bond cleavage reaction that is closely analogous to the carotenoid cleavage reaction of ABA biosynthesis. Southern blots indicated a family of four to six related genes in maize. The Vp14 mRNA is expressed in embryos and roots and is strongly induced in leaves by water stress. A family of Vp14-related genes evidently controls the first committed step of ABA biosynthesis. These genes are likely to play a key role in the developmental and environmental control of ABA synthesis in plants.
Abstract:
Tobacco etch virus (TEV) protease recognizes a 7-aa consensus sequence, Glu-Xaa-Xaa-Tyr-Xaa-Gln-Ser, where Xaa can be almost any amino acyl residue. Cleavage occurs between the conserved Gln and Ser residues. Because of its distinct specificity, TEV protease can be expressed in the cytoplasm without interfering with viability. Polypeptides that are not natural substrates of TEV protease are proteolyzed if they carry the appropriate cleavage site. Thus, this protease can be used to study target proteins in their natural environment in vivo, as well as in vitro. We describe two Tn5-based mini-transposons that insert TEV protease cleavage sites at random into target proteins. TnTIN introduces TEV cleavage sites into cytoplasmic proteins. TnTAP facilitates the same operation for proteins localized to the bacterial cell envelope. By using two different target proteins, SecA and TolC, we show that such modified proteins can be cleaved in vivo and in vitro by TEV protease. Possible applications of the site-specific proteolysis approach are topological studies of soluble as well as of inner and outer membrane proteins, protein inactivation, insertion mutagenesis experiments, and protein tagging.
Abstract:
Nuclear pore complexes (NPCs) are large proteinaceous portals for exchanging macromolecules between the nucleus and the cytoplasm. Revealing how this transport apparatus is assembled will be critical for understanding the nuclear transport mechanism. To address this issue and to identify factors that regulate NPC formation and dynamics, a novel fluorescence-based strategy was used. This approach is based on the functional tagging of NPC proteins with the green fluorescent protein (GFP), and the hypothesis that NPC assembly mutants will have distinct GFP-NPC signals as compared with wild-type (wt) cells. By fluorescence-activated cell sorting for cells with low GFP signal from a population of mutagenized cells expressing GFP-Nup49p, three complementation groups were identified: two correspond to mutant nup120 and gle2 alleles that result in clusters of NPCs. Interestingly, a third group was a novel temperature-sensitive allele of nup57. The lowered GFP-Nup49p incorporation in the nup57-E17 cells resulted in a decreased fluorescence level, which was due in part to a sharply diminished interaction between the carboxy-terminal truncated nup57pE17 and wt Nup49p. Interestingly, the nup57-E17 mutant also affected the incorporation of a specific subset of other nucleoporins into the NPC. Decreased levels of NPC-associated Nsp1p and Nup116p were observed. In contrast, the localizations of Nic96p, Nup82p, Nup159p, Nup145p, and Pom152p were not markedly diminished. Coincidentally, nuclear import capacity was inhibited. Taken together, the identification of such mutants with specific perturbations of NPC structure validates this fluorescence-based strategy as a powerful approach for providing insight into the mechanism of NPC biogenesis.
Abstract:
In higher eukaryotic cells, the spindle forms along with chromosome condensation in mitotic prophase. In metaphase, chromosomes are aligned on the spindle with sister kinetochores facing toward the opposite poles. In anaphase A, sister chromatids separate from each other without spindle extension, whereas spindle elongation takes place during anaphase B. We have critically examined whether such mitotic stages also occur in a lower eukaryote, Schizosaccharomyces pombe. Using the green fluorescent protein tagging technique, early mitotic to late anaphase events were observed in living fission yeast cells. S. pombe has three phases in spindle dynamics, spindle formation (phase 1), constant spindle length (phase 2), and spindle extension (phase 3). Sister centromere separation (anaphase A) rapidly occurred at the end of phase 2. The centromere showed dynamic movements throughout phase 2 as it moved back and forth and was transiently split in two before its separation, suggesting that the centromere was positioned in a bioriented manner toward the poles at metaphase. Microtubule-associating Dis1 was required for the occurrence of constant spindle length and centromere movement in phase 2. Normal transition from phase 2 to 3 needed DNA topoisomerase II and Cut1 but not Cut14. The duration of each phase was highly dependent on temperature.
Abstract:
We have found conditions for saturation mutagenesis by restriction enzyme mediated integration that result in plasmid tagging of disrupted genes. Using this method we selected for mutations in genes that act at checkpoints downstream of the intercellular signaling system that controls encapsulation in Dictyostelium discoideum. One of these genes, mkcA, is a member of the mitogen-activating protein kinase cascade family while the other, regA, is a novel bipartite gene homologous to response regulators in one part and to cyclic nucleotide phosphodiesterases in the other part. Disruption of either of these genes results in partial suppression of the block to spore formation resulting from the loss of the prestalk genes, tagB and tagC. The products of the tag genes have conserved domains of serine proteases attached to ATP-driven transporters, suggesting that they process and export peptide signals. Together, these genes outline an intercellular communication system that coordinates organismal shape with cellular differentiation during development.