35 results for "Annotations sémantiques"


Relevance:

10.00%

Publisher:

Abstract:

Most approaches to business process compliance are restricted to the analysis of the structure of processes. It has been argued that full regulatory compliance requires information not only on the structure of processes but also on what the tasks in a process do. To this end, Governatori and Sadiq [2007] proposed to extend business processes with semantic annotations. We propose a methodology to automatically extract one kind of such annotation; in particular, the annotations related to the data schema and templates linked to the various tasks in a business process.

Background
Predicting protein subnuclear localization is a challenging problem. Some previous approaches based on non-sequence information, including Gene Ontology annotations and kernel fusion, have respective limitations. The aim of this work is twofold: to propose a novel individual feature extraction method, and to develop an ensemble method that improves prediction performance using comprehensive information represented as a high-dimensional feature vector obtained by 11 feature extraction methods.

Methodology/Principal Findings
A novel two-stage multiclass support vector machine is proposed to predict protein subnuclear localizations. It considers only those feature extraction methods based on amino acid classifications and physicochemical properties. To speed up the system, an automatic search method for the kernel parameter is used. The prediction performance of our method is evaluated on four datasets: the Lei dataset, a multi-localization dataset, the SNL9 dataset, and a new independent dataset. Under leave-one-out cross-validation, the overall accuracy is 75.2% for 6 localizations on the Lei dataset and 72.1% for 9 localizations on the SNL9 dataset, with 71.7% for the multi-localization dataset and 69.8% for the new independent dataset. Comparisons with existing methods show that our method performs better for both single-localization and multi-localization proteins and achieves more balanced sensitivities and specificities on large-size and small-size subcellular localizations. The overall accuracy improvements are 4.0% and 4.7% for single-localization proteins and 6.5% for multi-localization proteins. The reliability and stability of our classification model are further confirmed by permutation analysis.

Conclusions
Our method is effective and valuable for predicting protein subnuclear localizations. A web server implementing the proposed method is freely available at http://bioinformatics.awowshop.com/snlpred_page.php.
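The abstract describes a multiclass SVM with an automatic search over the kernel parameter, evaluated by leave-one-out cross-validation. The following is a minimal sketch of that kind of pipeline using scikit-learn; the feature vectors are invented toy data and the generic one-vs-rest RBF SVM is a stand-in, not the authors' two-stage system.

```python
# Hedged sketch: RBF-kernel multiclass SVM with an automatic grid search
# over the kernel parameter gamma, evaluated by leave-one-out CV.
# The toy "feature vectors" below are invented for illustration.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, LeaveOneOut

rng = np.random.default_rng(0)
# Three well-separated classes standing in for localization categories.
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(20, 8)) for c in (0.0, 2.0, 4.0)])
y = np.repeat([0, 1, 2], 20)

# Automatic search over the kernel parameter, as the abstract suggests.
search = GridSearchCV(SVC(kernel="rbf", C=1.0),
                      param_grid={"gamma": [0.01, 0.1, 1.0]},
                      cv=5)
search.fit(X, y)

# Leave-one-out evaluation, the protocol used in the abstract.
loo_correct = 0
for train_idx, test_idx in LeaveOneOut().split(X):
    clf = SVC(kernel="rbf", C=1.0, gamma=search.best_params_["gamma"])
    clf.fit(X[train_idx], y[train_idx])
    loo_correct += int(clf.predict(X[test_idx])[0] == y[test_idx][0])
accuracy = loo_correct / len(y)
print(round(accuracy, 2))
```

On cleanly separable toy data the leave-one-out accuracy is close to 1.0; the point of the sketch is the pipeline shape, not the number.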

Building and maintaining software are not easy tasks. However, thanks to advances in web technologies, a new paradigm is emerging in software development. The Service Oriented Architecture (SOA) is a relatively new approach that helps bridge the gap between business and IT and also helps systems remain flexible. However, there are still several challenges with SOA. As the number of available services grows, developers are faced with the problem of discovering the services they need. Public service repositories such as Programmable Web provide only limited search capabilities. Several mechanisms have been proposed to improve web service discovery by using semantics. However, most of these require manually tagging the services with concepts in an ontology. Adding semantic annotations is a non-trivial process that requires a certain skill set from the annotator, as well as the availability of domain ontologies that include the concepts related to the topics of the service. These issues have prevented such mechanisms from becoming widespread. This thesis focuses on two main problems. First, to avoid the overhead of manually adding semantics to web services, several automatic methods to include semantics in the discovery process are explored. Although experimentation with some of these strategies has been conducted in the past, the results reported in the literature are mixed. Second, Wikipedia is explored as a general-purpose ontology. The benefit of using it as an ontology is assessed by comparing these semantics-based methods to classic term-based information retrieval approaches. The contribution of this research is significant because, to the best of our knowledge, no comprehensive analysis exists of the impact of using Wikipedia as a source of semantics in web service discovery.
The main output of this research is a web service discovery engine that implements these methods and a comprehensive analysis of the benefits and trade-offs of these semantics-based discovery approaches.

Tag recommendation is a specific recommendation task: recommending metadata (tags) for a web resource (an item) during the user annotation process. In this context, the sparsity problem refers to situations where tags must be produced for items with few annotations, or for users who tag few items. Most state-of-the-art approaches to tag recommendation are rarely evaluated under such conditions, or perform poorly in them. This paper presents a combined method for mitigating the sparsity problem in tag recommendation, chiefly by expanding and ranking candidate tags based on similar items' tags and an existing tag ontology. We evaluated the approach on two public social bookmarking datasets. The experimental results show better recommendation accuracy in sparsity situations than several state-of-the-art methods.
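The candidate-expansion idea can be sketched as follows: for a sparsely tagged item, borrow tags from similar items and rank them by similarity-weighted frequency. This is a minimal illustration with invented data and an assumed Jaccard similarity over existing tags; the paper's method additionally re-ranks using a tag ontology, which is omitted here.

```python
# Hedged sketch: expand candidate tags from similar items' tags and rank
# by similarity-weighted frequency. Data and similarity measure invented.
from collections import Counter

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def recommend_tags(item_tags, corpus, k=3):
    """Rank candidate tags for an item with few tags of its own."""
    scores = Counter()
    for other_tags in corpus:
        sim = jaccard(item_tags, other_tags)
        for tag in other_tags:
            if tag not in item_tags:
                scores[tag] += sim            # weight frequency by similarity
    return [t for t, _ in scores.most_common(k)]

corpus = [["python", "web", "flask"],
          ["python", "web", "django"],
          ["haskell", "types"]]
print(recommend_tags(["python"], corpus))     # "web" ranks first
```

Here the item carries a single tag, the sparsity case the paper targets, and "web" wins because it appears in both similar items.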

The design of concurrent software systems, in particular process-aware information systems, involves behavioral modeling at various stages. Recently, approaches to the behavioral analysis of such systems have been based on declarative abstractions defined as sets of behavioral relations. However, these relations are typically defined in an ad hoc manner. In this paper, we address the lack of a systematic exploration of the fundamental relations that can be used to capture the behavior of concurrent systems, i.e., co-occurrence, conflict, causality, and concurrency. Besides defining this spectrum of behavioral relations, which we refer to as the 4C spectrum, we show that our relations give rise to implication lattices. We further provide operationalizations of the proposed relations, first proposing techniques for computing relations in unlabeled systems and then lifting them to labeled systems, i.e., systems in which state transitions have semantic annotations. Finally, we report experimental results on the efficiency of the proposed computations.
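The four relation families named above can be illustrated naively over a finite set of traces (sequences of task labels). The paper defines the 4C relations over system states rather than trace sets, so this sketch with invented traces is only indicative of what each relation captures.

```python
# Hedged sketch: naive trace-based versions of co-occurrence, conflict,
# causality, and concurrency. Example traces are invented.
from itertools import product

def relations(traces):
    events = sorted({e for t in traces for e in t})
    rel = {}
    for a, b in product(events, repeat=2):
        both = [t for t in traces if a in t and b in t]
        co_occur = all((a in t) == (b in t) for t in traces)   # always together
        conflict = not both                                     # never together
        # a causes b: whenever both occur, a precedes b
        causes = bool(both) and all(t.index(a) < t.index(b) for t in both)
        # concurrency: both orders are observed across traces
        parallel = (any(t.index(a) < t.index(b) for t in both) and
                    any(t.index(b) < t.index(a) for t in both))
        rel[(a, b)] = dict(co=co_occur, xor=conflict,
                           causes=causes, parallel=parallel)
    return rel

r = relations([("start", "pay", "ship"), ("start", "ship", "pay")])
print(r[("start", "pay")]["causes"], r[("pay", "ship")]["parallel"])
```

In the two invented traces, "start" always precedes "pay" (causality), while "pay" and "ship" occur in both orders (concurrency).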

In most intent recognition studies, annotations of query intent are created post hoc by external assessors who are not the searchers themselves. It is important for the field to better understand the quality of this process as an approximation for determining the searcher's actual intent. Some studies have investigated the reliability of the query intent annotation process by measuring interassessor agreement. However, these studies did not measure the validity of the judgments, that is, to what extent the annotations match the searcher's actual intent. In this study, we asked both the searchers themselves and external assessors to classify queries using the same intent classification scheme. We show that of the seven dimensions in our intent classification scheme, four can reliably be used for query annotation. Of these four, only the annotations on the topic and spatial sensitivity dimensions are valid when compared with the searcher's annotations. The difference between the interassessor agreement and the assessor-searcher agreement was significant on all dimensions, showing that agreement between external assessors is not a good estimator of the validity of the intent classifications. We therefore encourage the research community to consider using query intent classifications made by the searchers themselves as test data.
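Agreement of this kind is commonly quantified with a chance-corrected statistic such as Cohen's kappa. The study does not state which coefficient it uses, so the following is a generic sketch with invented intent labels.

```python
# Hedged sketch: Cohen's kappa, a standard chance-corrected agreement
# statistic for two annotators. The intent labels below are invented.
from collections import Counter

def cohens_kappa(a, b):
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    # chance agreement from each annotator's marginal label frequencies
    expected = sum(ca[l] * cb[l] for l in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)

assessor = ["nav", "info", "info", "trans", "info", "nav"]
searcher = ["nav", "info", "trans", "trans", "info", "nav"]
print(round(cohens_kappa(assessor, searcher), 3))   # prints 0.75
```

The study's central point is that a high value between two external assessors does not imply a high value between an assessor and the searcher; the statistic must be computed for the pair one actually cares about.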

This paper presents a novel framework for the unsupervised alignment of an ensemble of temporal sequences. The approach draws inspiration from the axiom that an ensemble of temporal signals stemming from the same source/class should have lower rank when "aligned" than when "misaligned". Our approach shares similarities with recent state-of-the-art methods for unsupervised image-ensemble alignment (e.g. RASL), which break the problem into a set of image alignment problems with well-known solutions (i.e. the Lucas-Kanade algorithm). Similarly, we propose a strategy for decomposing the problem of temporal ensemble alignment into a set of independent sequence alignment problems, which we claim can be solved reliably through Dynamic Time Warping (DTW). We demonstrate the utility of our method on the Cohn-Kanade+ dataset by aligning expression onset across multiple sequences, which allows us to automate the rapid discovery of event annotations.
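The pairwise primitive the framework relies on, Dynamic Time Warping, can be sketched with the textbook O(nm) dynamic program. This is not the paper's ensemble formulation, and the sequences below are toy data.

```python
# Hedged sketch: textbook DTW returning the minimal cumulative
# alignment cost between two 1-D sequences.
import math

def dtw(x, y):
    n, m = len(x), len(y)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch x
                                 cost[i][j - 1],      # stretch y
                                 cost[i - 1][j - 1])  # step both
    return cost[n][m]

# A time-shifted copy of a signal aligns at zero cost under DTW,
# which is what makes it suitable for aligning event onsets.
a = [0, 1, 2, 3, 2, 1, 0]
b = [0, 0, 1, 2, 3, 2, 1, 0]
print(dtw(a, b))   # prints 0.0
```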

We present findings from a field trial of CAM (Cooperative Artefact Memory) -- a mobile-tagging based messaging system -- in a design studio environment. CAM allows individuals to collaboratively store relevant information onto their physical design artefacts, such as sketches, collages, story-boards, and physical mock-ups in the form of messages, annotations and external web links. We studied the use of CAM in three student design projects. We observed that CAM facilitated new ways of collaborating in joint design projects. The serendipitous and asynchronous nature of CAM facilitated expressions of design aesthetics, allowed designers to have playful interactions, supported exploration of new design ideas, and supported designers' reflective practices. In general, our results show how CAM transformed mundane design artefacts into "living" artefacts that made the creative and playful side of cooperative design visible.

Physical design objects such as sketches, drawings, collages, storyboards and models play an important role in supporting communication and coordination in design studios. CAM (Cooperative Artefact Memory) is a mobile-tagging based messaging system that allows designers to collaboratively store relevant information onto their design objects in the form of messages, annotations and external web links. We studied the use of CAM in a Product Design studio over three weeks, involving three different design teams. In this paper, we briefly describe CAM and show how it serves as 'object memory'.

In this paper, we report the results of a field trial of a Ubicomp system called CAM that is aimed at supporting and enhancing collaboration in a design studio environment. CAM uses a mobile-tagging application which allows designers to collaboratively store relevant information onto their physical design objects in the form of messages, annotations and external web links. The purpose of our field trial was to explore the role of augmented objects in supporting and enhancing creative work. Our results show that CAM was used not only to support participants' mutual awareness and coordination but also to facilitate designers in appropriating their augmented design objects to be explorative, extendable and playful, supporting creative aspects of design work. In general, our results show how CAM transformed static design objects into 'remarkable' objects that made the creative and playful side of cooperative design visible.

In this paper, we provide the results of a field study of a Ubicomp system called CAM (Cooperative Artefact Memory) in a Product Design studio. CAM is a mobile-tagging based messaging system that allows designers to store relevant information onto their design artefacts in the form of messages, annotations and external web links. From our field study results, we observe that the use of CAM adds another shared ‘space’ onto these design artefacts – that are in their natural settings boundary objects themselves. In the paper, we provide several examples from the field illustrating how CAM helps in the design process.

Over the past decade the mitochondrial (mt) genome has become the most widely used genomic resource available for systematic entomology. While the availability of other types of '–omics' data – in particular transcriptomes – is increasing rapidly, mt genomes are still vastly cheaper to sequence and are far less demanding of high quality templates. Furthermore, almost all other '–omics' approaches also sequence the mt genome, and so it can form a bridge between legacy and contemporary datasets. Mitochondrial genomes have now been sequenced for all insect orders, and in many instances for representatives of each major lineage within orders (suborders, series or superfamilies depending on the group). They have also been applied to systematic questions at all taxonomic scales, from resolving interordinal relationships (e.g. Cameron et al., 2009; Wan et al., 2012; Wang et al., 2012), through many intraordinal (e.g. Dowton et al., 2009; Timmermans et al., 2010; Zhao et al., 2013a) and family-level studies (e.g. Nelson et al., 2012; Zhao et al., 2013b), to population/biogeographic studies (e.g. Ma et al., 2012). Methodological issues around the use of mt genomes in insect phylogenetic analyses and the empirical results found to date have recently been reviewed by Cameron (2014), yet the technical aspects of sequencing and annotating mt genomes were not covered. Most papers that report a new mt genome describe their methods in a simplified form that can be difficult to replicate without specific knowledge of the field. Published studies utilize a sufficiently wide range of approaches, usually without justification for the one chosen, that confusion about commonly used jargon such as 'long PCR' and 'primer walking' could be a serious barrier to entry. Furthermore, sequenced mt genomes have been annotated (gene locations defined) to wildly varying standards, and improving data quality through consistent annotation procedures will benefit all downstream users of these datasets.
The aims of this review are therefore to: (1) describe in detail the various sequencing methods used on insect mt genomes; (2) explore the strengths and weaknesses of the different approaches; (3) outline the procedures and software used for insect mt genome annotation; and (4) highlight quality-control steps used for new annotations and for improving the re-annotation of previously sequenced mt genomes used in systematic or comparative research.

This paper reports on the 2nd ShARe/CLEFeHealth evaluation lab, which continues our evaluation resource building activities for the medical domain. In this lab we focus on patients' information needs, as opposed to the more common campaign focus on the specialised information needs of physicians and other healthcare workers. The usage scenario of the lab is to help patients and their next of kin understand eHealth information, in particular clinical reports. The 1st ShARe/CLEFeHealth evaluation lab was held in 2013 and consisted of three tasks: Task 1 focused on named entity recognition and normalization of disorders; Task 2 on normalization of acronyms/abbreviations; and Task 3 on information retrieval to address questions patients may have when reading clinical reports. This year's lab introduces a new challenge in Task 1 on visual-interactive search and exploration of eHealth data. Its aim is to help patients (or their next of kin) with readability issues related to their hospital discharge documents and with related information search on the Internet. Task 2 continues the information extraction work of the 2013 lab, specifically focusing on disorder attribute identification and normalization from clinical text. Finally, this year's Task 3 further extends the 2013 information retrieval task by cleaning the 2013 document collection and introducing a new query generation method and multilingual queries. The de-identified clinical reports used by the three tasks were from US intensive care and originated from the MIMIC II database. Other text documents for Tasks 1 and 3 were from the Internet and originated from the Khresmoi project. Task 2 annotations originated from the ShARe annotations. For Tasks 1 and 3, new annotations, queries, and relevance assessments were created. 50, 79, and 91 people registered their interest in Tasks 1, 2, and 3, respectively, and 24 unique teams participated, with 1, 10, and 14 teams in Tasks 1, 2, and 3, respectively.
The teams were from Africa, Asia, Canada, Europe, and North America. The Task 1 submission, reviewed by 5 expert peers, related to the task evaluation category of Effective use of interaction and targeted the needs of both expert and novice users. The best system had an Accuracy of 0.868 in Task 2a, an F1-score of 0.576 in Task 2b, and Precision at 10 (P@10) of 0.756 in Task 3. The results demonstrate the substantial community interest and capabilities of these systems in making clinical reports easier to understand for patients. The organisers have made data and tools available for future research and development.

Due to the popularity of security cameras in public places, it is of interest to design an intelligent system that can detect events automatically and efficiently. This paper proposes a novel algorithm for multi-person event detection. To ensure greater than real-time performance, features are extracted directly from compressed MPEG video. A novel histogram-based feature descriptor that captures the angles between extracted particle trajectories is proposed, which allows us to capture the motion patterns of multi-person events in the video. To alleviate the need for fine-grained annotation, we propose the use of Labelled Latent Dirichlet Allocation, a "weakly supervised" method that allows the use of coarse temporal annotations, which are much simpler to obtain. This novel system is able to run at approximately ten times real-time while preserving state-of-the-art detection performance for multi-person events on a 100-hour real-world surveillance dataset (TRECVid SED).
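A minimal sketch of the quantise-angles-into-bins idea behind a histogram-of-angles descriptor. Here the angles are taken between consecutive displacement vectors within each invented toy trajectory, whereas the paper's descriptor captures angles between extracted particle trajectories, so this only illustrates the binning step.

```python
# Hedged sketch: histogram of turning angles along toy trajectories.
# Bin count and example tracks are invented for illustration.
import math

def angle_histogram(trajectories, bins=8):
    hist = [0] * bins
    for traj in trajectories:
        # displacement vectors between consecutive points
        vecs = [(x2 - x1, y2 - y1)
                for (x1, y1), (x2, y2) in zip(traj, traj[1:])]
        for (ux, uy), (vx, vy) in zip(vecs, vecs[1:]):
            ang = math.atan2(uy, ux) - math.atan2(vy, vx)
            ang %= 2 * math.pi                       # fold into [0, 2*pi)
            hist[min(int(ang / (2 * math.pi) * bins), bins - 1)] += 1
    return hist

tracks = [[(0, 0), (1, 0), (2, 0), (2, 1)],   # straight, then a left turn
          [(0, 0), (0, 1), (0, 2)]]           # straight line
print(angle_histogram(tracks))
```

Straight motion accumulates in the zero-angle bin while turns land in other bins, which is what lets such a descriptor separate motion patterns.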

Acoustic sensors allow scientists to scale environmental monitoring over large spatiotemporal scales. The faunal vocalisations captured by these sensors can answer ecological questions; however, identifying these vocalisations within recorded audio is difficult: automatic recognition is currently intractable, and manual recognition is slow and error-prone. In this paper, a semi-automated approach to call recognition is presented: an automated decision support tool that assists users in the manual annotation process is tested. The respective strengths of human and computer analysis are used to complement one another. The tool recommends the species of an unknown vocalisation and thereby minimises the need to memorise a large corpus of vocalisations. In the case of a folksonomic tagging system, recommending species tags also minimises the proliferation of redundant tag categories. We describe two algorithms: (1) a "naïve" decision support tool (16%–64% sensitivity) with efficiency O(n), which becomes unscalable as more data are added, and (2) a scalable alternative with 48% sensitivity and an efficiency of O(log n). The improved algorithm was also tested in an HTML-based annotation prototype. The result of this work is a decision support tool for annotating faunal acoustic events that may be utilised by other bioacoustics projects.
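The O(n)-versus-O(log n) contrast the abstract draws can be illustrated by indexing a reference corpus on a single sortable feature: a linear scan compares against every exemplar, while a sorted index needs only a binary search. The "peak frequency" key, the species list, and the matching rule below are all invented for illustration; the paper's actual features and matching scheme are more involved.

```python
# Hedged sketch: O(n) scan vs O(log n) binary-search lookup over a
# corpus sorted by one invented scalar acoustic feature.
import bisect

corpus = sorted([(2200.0, "Lewin's honeyeater"),
                 (3800.0, "eastern whipbird"),
                 (6500.0, "scarlet honeyeater")])
keys = [k for k, _ in corpus]

def suggest_linear(freq):                      # O(n) baseline
    return min(corpus, key=lambda kv: abs(kv[0] - freq))[1]

def suggest_logn(freq):                        # O(log n) via binary search
    i = bisect.bisect_left(keys, freq)
    best = min((j for j in (i - 1, i) if 0 <= j < len(keys)),
               key=lambda j: abs(keys[j] - freq))
    return corpus[best][1]

print(suggest_linear(4000.0), "|", suggest_logn(4000.0))
```

Both functions return the same recommendation; only the cost of producing it differs as the corpus grows, which is the scalability point the paper makes.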