Digital Library

241 results for Textual information processing

Quality of service in flexible workflows through process constraints

Relevance: 80.00%

Abstract:

Workflow technology has delivered effectively for a large class of business processes, providing the requisite control and monitoring functions. At the same time, this technology has been criticized for its limited ability to cope with dynamically changing business conditions, which require business processes to be adapted frequently, and for its limited ability to model business processes that cannot be entirely predefined. These requirements call for generic solutions that balance process control and flexibility. In this paper we present a framework that allows a workflow to execute on the basis of a partially specified model, where the full specification of the model is made at runtime and may be unique to each instance. This framework is based on the notion of process constraints. Whereas process constraints may be specified for any aspect of the workflow (structural, temporal, and so on), our focus in this paper is on a constraint that allows dynamic selection of activities for inclusion in a given instance. We call these cardinality constraints, and this paper discusses their specification and validation requirements.
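
One possible reading of a cardinality constraint can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not define the constraint formally, so the `CardinalityConstraint` class, its `minimum`/`maximum` bounds, and the activity names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CardinalityConstraint:
    """Hypothetical constraint: choose between `minimum` and `maximum`
    activities from `pool` for one workflow instance."""
    pool: frozenset
    minimum: int
    maximum: int

    def is_well_formed(self) -> bool:
        # Specification-time check: the bounds must be satisfiable by the pool.
        return 0 <= self.minimum <= self.maximum <= len(self.pool)

    def accepts(self, selected) -> bool:
        # Runtime check: an instance may only pick activities from the pool,
        # and the number picked must fall within the bounds.
        chosen = set(selected)
        return chosen <= self.pool and self.minimum <= len(chosen) <= self.maximum


# Illustrative activity names, not taken from the paper.
diagnostics = CardinalityConstraint(
    frozenset({"blood_test", "x_ray", "mri", "ecg"}), minimum=1, maximum=2
)
print(diagnostics.is_well_formed())
print(diagnostics.accepts(["x_ray", "ecg"]))
print(diagnostics.accepts(["x_ray", "ecg", "mri"]))
```

Separating the specification-time check from the per-instance check mirrors the abstract's distinction between specifying a constraint and validating it; the framework described in the paper may draw that line differently.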

Detecting and sorting targeting peptides with neural networks and support vector machines

Relevance: 80.00%

Abstract:

This paper presents a composite multi-layer classifier system for predicting the subcellular localization of proteins based on their amino acid sequence. The work is an extension of our previous predictor PProwler v1.1, which is itself built upon the series of predictors SignalP and TargetP. In this study we outline experiments conducted to improve the classifier design. The major improvement came from using support vector machines as a "smart gate" sorting the outputs of several different targeting peptide detection networks. Our final model (PProwler v1.2) gives MCC values of 0.873 for non-plant and 0.849 for plant proteins. The model improves upon the accuracy of our previous subcellular localization predictor (PProwler v1.1) by 2% for plant data (which represents a 7.5% improvement upon TargetP).
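
For readers unfamiliar with the MCC (Matthews correlation coefficient) figures quoted above, the standard binary form can be computed from a confusion matrix as sketched below. The abstract does not say how the score was aggregated across localization classes, so this shows only the textbook two-class definition, and the counts are made up for illustration.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the score is 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Made-up counts for illustration only; they are not results from the paper.
print(round(mcc(tp=90, tn=85, fp=15, fn=10), 3))
```

The score ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and unlike plain accuracy it is computed from all four cells of the confusion matrix.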

Advanced Web and Network Technologies, and Applications

Relevance: 80.00%

Frontiers of WWW Research and Development: APWeb 2006: 8th Asia-Pacific Web Conference, Harbin, China, January 16-18, 2006. Proceedings

Relevance: 80.00%

Fuzzy Systems and Knowledge Discovery

Relevance: 80.00%

Advanced Data Mining and Applications

Relevance: 80.00%

MIS Quarterly

Relevance: 80.00%

Selected Papers from the Sixth IFIP 2.6 Working Conference on Visual Database Systems

Relevance: 80.00%

Authorship Attribution with Support Vector Machines

Relevance: 80.00%

Abstract:

In this paper we explore the use of text-mining methods for the identification of the author of a text. We apply the support vector machine (SVM) to this problem because it can cope with half a million inputs, requires no feature selection, and can process the frequency vector of all words of a text. We performed a number of experiments with texts from a German newspaper. With nearly perfect reliability the SVM was able to reject other authors, and it detected the target author in 60–80% of the cases. In a second experiment, we ignored nouns, verbs and adjectives and replaced them by grammatical tags and bigrams. This resulted in slightly reduced performance. Author detection with SVMs on full word forms was remarkably robust even if the author wrote about different topics.
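
The "frequency vector of all words of a text" mentioned above can be illustrated with a small sketch. It covers only the feature-extraction step and leaves the SVM out; the tokenization rule, the lowercasing, and the tiny vocabulary are assumptions made for the example, not details from the paper.

```python
import re
from collections import Counter


def frequency_vector(text: str, vocabulary: list[str]) -> list[float]:
    """Relative frequency of each vocabulary word in `text`."""
    # Assumption: words are runs of letters/digits, compared case-insensitively.
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [counts[word] / total for word in vocabulary]


# Toy vocabulary and sentence, chosen only to make the output easy to check.
vocabulary = ["der", "die", "und"]
print(frequency_vector("Der Hund und die Katze und der Vogel", vocabulary))
# prints [0.25, 0.125, 0.25]
```

In the setting the abstract describes, the vocabulary would hold every word form seen in the corpus, which is how the input dimension reaches the hundreds of thousands.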

Special issue for 2004 Annual Conference of IS/IT Issues in Asia-Pacific

Relevance: 80.00%

IJDWM Special Issue: Advances in Data Mining Applications

Relevance: 80.00%

Abstract:

This special issue is a collection of selected papers published in the proceedings of the First International Conference on Advanced Data Mining and Applications (ADMA), held in Wuhan, China in 2005. The articles focus on innovative applications of data mining approaches to problems that involve large data sets or incomplete and noisy data, or that demand optimal solutions.

PLD: A distillation algorithm for misclassified documents

Relevance: 80.00%

Interactive email filtering: Learning from misclassified examples

Relevance: 80.00%

Abstract:

Learning from mistakes has proven to be an effective way of learning in interactive document classification. In this paper we propose an approach to learning effectively from mistakes in the email filtering process. Our system employs both SVM and Winnow machine learning algorithms to learn from misclassified email documents and refine the email filtering process accordingly. Our experiments have shown that the training of an email filter becomes much more effective and faster.
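
Winnow is a natural fit for this setting because it is mistake-driven: weights change only when an example is misclassified. The sketch below shows the common Winnow2 variant on binary features. It is not the system from the paper (the abstract gives no implementation details), and the feature names and training examples are invented.

```python
class Winnow:
    """Winnow2 for binary feature vectors: weights change only on mistakes."""

    def __init__(self, n_features: int, alpha: float = 2.0):
        self.weights = [1.0] * n_features
        self.alpha = alpha
        self.threshold = float(n_features)

    def predict(self, x) -> int:
        # Sum the weights of the active features and compare to the threshold.
        score = sum(w for w, xi in zip(self.weights, x) if xi)
        return 1 if score >= self.threshold else 0

    def learn(self, x, label: int) -> bool:
        """Update on one example; return True if it was misclassified."""
        if self.predict(x) == label:
            return False
        # Promote active features after a missed spam, demote after a false alarm.
        factor = self.alpha if label == 1 else 1.0 / self.alpha
        self.weights = [w * factor if xi else w for w, xi in zip(self.weights, x)]
        return True


# Hypothetical features: ["free", "winner", "meeting", "report"]; 1 = spam.
training = [([1, 1, 0, 0], 1), ([0, 0, 1, 1], 0)]
spam_filter = Winnow(n_features=4)
for _ in range(3):
    for x, label in training:
        spam_filter.learn(x, label)
print(spam_filter.predict([1, 1, 0, 0]), spam_filter.predict([0, 0, 1, 1]))
```

In an interactive filter, `learn` would be called whenever the user corrects a misfiled message, so each correction immediately adjusts the weights of the words in that message.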

A new framework of privacy preserving data sharing

Relevance: 80.00%

Machine learning for matching astronomy catalogues

Relevance: 80.00%

Abstract:

An emerging issue in the field of astronomy is the integration, management and utilization of databases from around the world to facilitate scientific discovery. In this paper, we investigate the application of two machine learning techniques, support vector machines and neural networks, to the problem of amalgamating catalogues of galaxies as objects from two disparate data sources: radio and optical. Formulating this as a classification problem presents several challenges, including dealing with a highly unbalanced data set. Unlike the conventional approach to the problem, which is based on a likelihood ratio, machine learning does not require density estimation and is shown here to provide a significant improvement in performance. We also report some experiments that explore the importance of the radio and optical data features for the matching problem.
