178 results for Biology, Microbiology | Biology, Bioinformatics | Biology, Virology | Computer Science
Abstract:
In Natural Language Processing (NLP) symbolic systems, several linguistic phenomena, for instance the thematic role relationships between sentence constituents, such as AGENT, PATIENT, and LOCATION, can be accounted for by a rule-based grammar. Another approach to NLP uses the connectionist model, which offers the benefits of learning, generalization, and fault tolerance, among others. A third option merges the two previous approaches into a hybrid one: a symbolic thematic theory supplies the connectionist network with initial knowledge. Inspired by neuroscience, we propose a symbolic-connectionist hybrid system called BIO theta PRED (BIOlogically plausible thematic (theta) symbolic-connectionist PREDictor), designed to reveal the thematic grid assigned to a sentence. Its connectionist architecture takes as input a featural representation of the words (based on the verb/noun WordNet classification and on the classical semantic microfeature representation) and produces as output the thematic grid assigned to the sentence. BIO theta PRED is designed to "predict" the thematic (semantic) roles assigned to words in a sentence context, employing a biologically inspired training algorithm and architecture and adopting a psycholinguistic view of thematic theory.
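As a rough illustration of the kind of mapping such a network learns, the sketch below trains a tiny feed-forward classifier from hand-built semantic microfeature vectors to thematic role labels. The features, roles, and architecture here are placeholders for illustration, not the actual BIO theta PRED design.

```python
# Toy sketch (not the authors' architecture): map verb/noun semantic
# microfeatures to a thematic role label with a small neural network.
import numpy as np
from sklearn.neural_network import MLPClassifier

# Each row: [animate, concrete, motion_verb, contact_verb] microfeatures
# (hypothetical feature set for illustration only)
X = np.array([
    [1, 1, 1, 0],   # animate noun + motion verb   -> AGENT
    [0, 1, 1, 0],   # inanimate noun + motion verb -> PATIENT
    [1, 1, 0, 1],   # animate noun + contact verb  -> AGENT
    [0, 1, 0, 1],   # inanimate noun + contact verb -> PATIENT
])
y = ["AGENT", "PATIENT", "AGENT", "PATIENT"]

clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
clf.fit(X, y)
print(clf.predict([[1, 1, 1, 0]]))  # should recover 'AGENT'
```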
Abstract:
This paper presents SMarty, a variability management approach for UML-based software product lines (PL). SMarty is supported by a UML profile, the SMartyProfile, and a process for managing variabilities, the SMartyProcess. SMartyProfile represents variabilities, variation points, and variants in UML models by applying a set of stereotypes. SMartyProcess is a set of activities that are systematically executed to trace, identify, and control variabilities in a PL based on SMarty. It also identifies variability implementation mechanisms and analyzes specific product configurations. In addition, a more comprehensive application of SMarty is presented using SEI's Arcade Game Maker PL. An evaluation of SMarty and a discussion of related work are also provided.
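To make the profile idea concrete, here is a small illustrative data model of stereotyped variation points and variants. The stereotype names and fields are placeholders, not the exact SMartyProfile definitions.

```python
# Illustrative sketch: model elements tagged with stereotypes that mark
# variation points and their variants, as a variability profile does.
from dataclasses import dataclass, field

@dataclass
class Variant:
    name: str
    stereotype: str = "variant"          # placeholder stereotype name

@dataclass
class VariationPoint:
    name: str
    stereotype: str = "variationPoint"   # placeholder stereotype name
    min_selection: int = 1               # variants a product must bind
    max_selection: int = 1
    variants: list = field(default_factory=list)

save_game = VariationPoint("SaveGame", variants=[
    Variant("SaveToFile"), Variant("SaveToMemoryCard"),
])
print([v.name for v in save_game.variants])
```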
Abstract:
A planar k-restricted structure is a simple graph whose blocks are planar and each has at most k vertices. Planar k-restricted structures are used by approximation algorithms for Maximum Weight Planar Subgraph, which motivates this work. The planar k-restricted ratio is the infimum, over simple planar graphs H, of the ratio of the number of edges in a maximum k-restricted structure subgraph of H to the number of edges of H. We prove that, as k tends to infinity, the planar k-restricted ratio tends to 1/2. The same result holds for the weighted version. Our results are based on analyzing the analogous ratios for outerplanar and weighted outerplanar graphs. Here both ratios tend to 1 as k goes to infinity, and we provide good estimates of the rates of convergence, showing that the rates differ between the weighted and unweighted cases.
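Written out as a formula, with our own notation ρ_k for the planar k-restricted ratio and S_k(H) ranging over k-restricted structure subgraphs of H, the main result reads:

```latex
\[
  \rho_k \;=\; \inf_{H \text{ planar}}
    \frac{\max_{S_k(H)} |E(S_k(H))|}{|E(H)|},
  \qquad
  \lim_{k \to \infty} \rho_k = \tfrac{1}{2}.
\]
```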
Abstract:
We simplify the known formula for the asymptotic estimate of the number of deterministic and accessible automata with n states over a k-letter alphabet. The proof relies on the theory of Lagrange inversion applied in the context of generalized binomial series.
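For reference, the classical Lagrange inversion formula that such proofs build on: if a formal power series w = w(t) satisfies the functional equation below with φ(0) ≠ 0, then the coefficients of any composite series H(w(t)) can be extracted directly.

```latex
% Classical Lagrange inversion: if w = t\,\phi(w) with \phi(0) \neq 0,
% then for any formal power series H,
\[
  [t^n]\, H(w(t)) \;=\; \frac{1}{n}\, [w^{n-1}]\, H'(w)\, \phi(w)^n .
\]
```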
Abstract:
Objective: We carry out a systematic assessment of a suite of kernel-based learning machines applied to the task of epilepsy diagnosis through automatic electroencephalogram (EEG) signal classification. Methods and materials: The kernel machines investigated include the standard support vector machine (SVM), the least squares SVM, the Lagrangian SVM, the smooth SVM, the proximal SVM, and the relevance vector machine. An extensive series of experiments was conducted on publicly available data, whose clinical EEG recordings were obtained from five normal subjects and five epileptic patients. The performance levels delivered by the different kernel machines are contrasted in terms of predictive accuracy, sensitivity to the kernel function/parameter value, and sensitivity to the type of features extracted from the signal. For this purpose, 26 values for the kernel parameter (radius) of two well-known kernel functions (namely, Gaussian and exponential radial basis functions) were considered, as well as 21 types of features extracted from the EEG signal, including statistical values derived from the discrete wavelet transform, Lyapunov exponents, and combinations thereof. Results: We first quantitatively assess the impact of the choice of the wavelet basis on the quality of the features extracted; four wavelet basis functions were considered in this study. Then, we provide the average cross-validation accuracy values delivered by 252 kernel machine configurations; in particular, 40%/35% of the best-calibrated standard and least squares SVM models reached a 100% accuracy rate for the two kernel functions considered. Moreover, we show the sensitivity profiles exhibited by a large sample of the configurations, whereby one can visually inspect their levels of sensitivity to the type of feature and to the kernel function/parameter value. Conclusions: Overall, the results show that all kernel machines are competitive in terms of accuracy, with the standard and least squares SVMs prevailing more consistently. Moreover, the choice of the kernel function and parameter value, as well as the choice of the feature extractor, are critical decisions, although the choice of the wavelet family seems less relevant. Also, the statistical values calculated over the Lyapunov exponents were good sources of signal representation, but not as informative as their wavelet counterparts. Finally, a typical sensitivity profile emerged across all types of machines, involving regions of stability separated by zones of sharp variation, with some kernel parameter values clearly associated with better accuracy rates (zones of optimality).
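A minimal sketch of the kind of kernel-parameter sensitivity study described: sweep the radius of a Gaussian (RBF) SVM and record cross-validated accuracy. Synthetic random features stand in for the wavelet/Lyapunov features, so the printed numbers are meaningless placeholders.

```python
# Sweep the RBF kernel parameter of an SVM and report mean CV accuracy.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 21))        # 21 features per EEG segment (placeholder)
y = rng.integers(0, 2, size=100)      # 0 = normal, 1 = epileptic (placeholder)

for gamma in [0.01, 0.1, 1.0, 10.0]:  # kernel radius values to probe
    acc = cross_val_score(SVC(kernel="rbf", gamma=gamma), X, y, cv=5).mean()
    print(f"gamma={gamma:<5} mean CV accuracy={acc:.3f}")
```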
Abstract:
Today several different unsupervised classification algorithms are commonly used to cluster similar patterns in a data set based only on its statistical properties. Especially in image data applications, self-organizing methods for unsupervised classification have been successfully applied to cluster pixels or groups of pixels in order to perform segmentation tasks. The first important contribution of this paper is the development of a self-organizing method for data classification, named the Enhanced Independent Component Analysis Mixture Model (EICAMM), built by modifying the Independent Component Analysis Mixture Model (ICAMM). The modifications address some of the model's limitations and aim to make it more efficient. Moreover, a pre-processing methodology is also proposed, based on combining Sparse Code Shrinkage (SCS) for image denoising with the Sobel edge detector. In the experiments of this work, EICAMM and other self-organizing models were applied to segmenting images in their original and pre-processed versions. A comparative analysis showed satisfactory and competitive image segmentation results obtained by the proposed methods.
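A sketch of the described pre-processing chain: denoise the image, then apply a Sobel edge detector. A Gaussian filter stands in for Sparse Code Shrinkage, which requires a learned ICA basis and is beyond a short example.

```python
# Denoise (SCS placeholder) then compute a Sobel edge-magnitude map.
import numpy as np
from scipy import ndimage

def preprocess(image: np.ndarray) -> np.ndarray:
    denoised = ndimage.gaussian_filter(image, sigma=1.0)  # stand-in for SCS
    gx = ndimage.sobel(denoised, axis=0)                  # horizontal gradient
    gy = ndimage.sobel(denoised, axis=1)                  # vertical gradient
    return np.hypot(gx, gy)                               # edge magnitude

edges = preprocess(np.random.default_rng(0).random((64, 64)))
print(edges.shape)  # (64, 64)
```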
Abstract:
We present a model in which agents show discrete behavior regarding their actions but hold continuous opinions that are updated by interacting with other agents. This new updating rule is applied to both the voter and Sznajd models for interaction between neighbors, and its consequences are discussed. Extremists appear naturally, and this seems to be a characteristic of the model.
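A toy sketch of the continuous-opinion / discrete-action mechanism: each agent holds an opinion in [-1, 1], acts according to its sign, and nudges its opinion toward a random neighbour's displayed action. The exact update rule in the paper may differ; this only illustrates the idea.

```python
# Continuous opinions, discrete actions, voter-style updates on a ring.
import numpy as np

rng = np.random.default_rng(0)
N, steps, mu = 100, 10_000, 0.1
opinions = rng.uniform(-1, 1, N)        # continuous internal state

for _ in range(steps):
    i = rng.integers(N)
    j = (i + rng.choice([-1, 1])) % N   # random neighbour on a ring
    action_j = np.sign(opinions[j])     # neighbour's discrete behaviour
    opinions[i] += mu * (action_j - opinions[i])
    opinions[i] = np.clip(opinions[i], -1, 1)

print("fraction of +1 actions:", np.mean(np.sign(opinions) > 0))
```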
Abstract:
Introduction: Internet users increasingly use the worldwide web to search for information relating to their health. This makes it necessary to create specialized tools capable of supporting users in their searches. Objective: To apply and compare strategies developed to investigate the use of the Portuguese version of Medical Subject Headings (MeSH) for constructing an automated classifier of Brazilian Portuguese-language web-based content as within or outside the field of healthcare, focusing on the lay public. Methods: 3658 Brazilian web pages were used to train the classifier and 606 Brazilian web pages were used to validate it. The proposed strategies were constructed using content-based vector methods for text classification, with Naive Bayes used to classify the vector patterns whose characteristics were obtained through the proposed strategies. Results: A strategy named InDeCS was developed specifically to adapt MeSH to the problem at hand. This approach achieved better accuracy for this pattern classification task (0.94 for sensitivity, specificity, and area under the ROC curve). Conclusions: Because of the significant results achieved by InDeCS, the tool has been successfully applied to the Brazilian healthcare search portal known as Busca Saude. Furthermore, it was shown that MeSH yields important results when used to classify web-based content aimed at the lay public. The study also showed that MeSH is able to map the mutable, non-deterministic characteristics of the web.
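A minimal sketch of the classification setup: a bag-of-words Naive Bayes model deciding whether a page is health-related. The MeSH-adaptation step (InDeCS) is not reproduced here; the toy pages and labels are placeholders.

```python
# Bag-of-words + Naive Bayes classifier for health vs. other pages.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

pages = ["sintomas de gripe e febre", "receita de bolo de chocolate",
         "tratamento para diabetes", "tabela do campeonato de futebol"]
labels = ["health", "other", "health", "other"]

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(pages, labels)
print(clf.predict(["febre alta e dor"]))  # expected: ['health']
```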
Abstract:
The large amount of information in electronic contracts hampers their establishment due to high complexity. An approach inspired by Software Product Lines (PL) and based on feature modelling was previously proposed to make this process more systematic through information reuse and structuring. By assessing the feature-based approach against a proposed set of requirements, it was shown that the approach does not allow the price of services and of Quality of Service (QoS) attributes to be considered in the negotiation and included in the electronic contract. Thus, this paper presents an extension of that approach in which prices and price types associated with Web services and QoS levels are applied. A prototype of the extended toolkit is also presented, along with an example experiment using the proposed approach.
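A sketch of the extension described: attaching prices and price types to Web-service features and their QoS levels in a feature model. The field names are illustrative, not the paper's metamodel.

```python
# Feature-model entry carrying negotiable price and QoS information.
from dataclasses import dataclass

@dataclass
class QoSLevel:
    name: str
    price: float
    price_type: str          # e.g. "per-request" or "monthly" (illustrative)

@dataclass
class ServiceFeature:
    service: str
    qos_levels: list

feature = ServiceFeature("PaymentGateway", [
    QoSLevel("gold",   price=0.05, price_type="per-request"),
    QoSLevel("silver", price=0.02, price_type="per-request"),
])
choice = min(feature.qos_levels, key=lambda q: q.price)
print(choice.name)           # cheapest QoS level offered for negotiation
```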
Abstract:
We show that commutative group spherical codes in R^n, as introduced by D. Slepian, are directly related to flat tori and quotients of lattices. As a consequence of this view, we derive new results on the geometry of these codes and an upper bound for their cardinality in terms of the minimum distance and the maximum center density of lattices and general spherical packings in half the dimension of the code. This bound is tight in the sense that it can be approached arbitrarily closely in any dimension. Examples of this approach and a comparison of this bound with the Union and Rankin bounds for general spherical codes are also presented.
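The flat-torus picture behind such codes rests on the standard parametrization below (notation ours): a point on a flat torus of radii r_1, ..., r_m with the radii squared summing to 1 lies on the unit sphere in R^{2m}.

```latex
\[
  \Phi(u_1,\dots,u_m) \;=\;
  \bigl(r_1\cos u_1,\; r_1\sin u_1,\; \dots,\;
        r_m\cos u_m,\; r_m\sin u_m\bigr) \in S^{2m-1},
  \qquad \sum_{i=1}^{m} r_i^2 = 1 .
\]
```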
Abstract:
Brazilian science has grown rapidly over the last decades. One example is the increase in the country's share of the world's scientific publications within the main international databases. But what is the actual weight of international publications within overall Brazilian productivity? To answer this question, we have elaborated a new indicator, the International Publication Ratio (IPR). The data source was the Lattes Database, organized by one of the main Brazilian S&T funding agencies, which encompasses publication data from 1997 to 2004 for about 51,000 Brazilian researchers. The influence of distinct parameters, such as sector, field, career age, and gender, is analyzed. We hope the data presented may help S&T managers and other S&T stakeholders to better understand the complexity underlying the concept of scientific productivity, especially in scientifically peripheral countries such as Brazil.
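The abstract does not spell out the indicator's definition; given its name, a natural reading (our assumption, not the paper's stated formula) is:

```latex
% Assumed definition, inferred from the indicator's name only.
\[
  \mathrm{IPR} \;=\;
  \frac{\text{publications in international venues}}
       {\text{total publications}} .
\]
```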
Abstract:
We present the implementation of a computational tool to generate new summaries from new source texts by means of the connectionist approach (artificial neural networks). Among the contributions this work intends to bring to natural language processing research, the use of a more biologically plausible connectionist architecture and training procedure for automatic summarization is emphasized. This choice relies on the expectation that it may increase computational efficiency when compared to the so-called biologically implausible algorithms.
Abstract:
Support for interoperability and interchangeability of software components that are part of a fieldbus automation system relies on the definition of open architectures, most of which involve proprietary technologies. Concurrently, standard, open, and non-proprietary technologies, such as XML, SOAP, and Web Services, have evolved considerably and become widespread in computing. This article presents a FOUNDATION fieldbus (TM) device description technology named Open-EDD, based on XML and related technologies (XSLT, DOM using the Xerces implementation, OO, XML Schema), proposing an open and non-proprietary alternative to the Electronic Device Description (EDD). This initial proposal includes defining Open-EDDML as the programming language of the technology in the FOUNDATION fieldbus (TM) protocol, implementing a compiler and a parser, and finally integrating and testing the new technology using field devices and a commercial fieldbus configurator. This study shows that the new technology is feasible and can be applied to other configurators or HMI applications used in fieldbus automation systems.
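A sketch of how an XML-based device description might be consumed via DOM, as Open-EDD proposes. The element and attribute names below are hypothetical, not the Open-EDDML schema.

```python
# Parse a (hypothetical) XML device description with the DOM API.
from xml.dom.minidom import parseString

DESCRIPTION = """
<device manufacturer="AcmeFields" type="pressure-transmitter">
  <parameter name="PV" datatype="float" unit="kPa"/>
  <parameter name="RANGE_HI" datatype="float" unit="kPa"/>
</device>
"""

dom = parseString(DESCRIPTION)
for p in dom.getElementsByTagName("parameter"):
    print(p.getAttribute("name"), p.getAttribute("datatype"),
          p.getAttribute("unit"))
```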
Abstract:
An experimental study of Polarization Dependent Loss (PDL) is performed in an optical recirculating loop (RCL). The RCL makes it possible to simulate transmission through various optical links at low cost, using just one optical fiber spool, one in-line amplifier, and a few optical filters and devices. Owing to its statistical nature, the total PDL in a recirculating loop differs from the simple sum of the PDL of each element, because the alignment of the PDL elements varies over time with environmental conditions such as fiber stress and temperature. Theoretical studies are also performed using the formalism of Jones and Mueller matrices to represent the different optical elements in the recirculating loop. PDL must be correctly characterized in order to properly evaluate its impact on the performance of next-generation DWDM systems. A comparison of theoretical and experimental results shows that a depolarization of 7% occurs in the experimental setup, probably caused by the optical amplifier owing to the depolarized nature of its amplified spontaneous emission.
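A sketch of the standard Jones-matrix route to an element's PDL: the maximum and minimum power transmissions are the extreme eigenvalues of J†J, and PDL(dB) = 10 log10(Tmax/Tmin). The example matrix is arbitrary, not data from the paper.

```python
# Compute PDL in dB from a 2x2 Jones matrix.
import numpy as np

def pdl_db(J: np.ndarray) -> float:
    T = np.linalg.eigvalsh(J.conj().T @ J)   # transmissions, ascending order
    return 10 * np.log10(T[-1] / T[0])

# A slightly polarization-dependent element followed by a rotation
J = np.array([[1.0, 0.0], [0.0, 0.9]]) @ np.array(
    [[np.cos(0.3), -np.sin(0.3)], [np.sin(0.3), np.cos(0.3)]])
print(f"PDL = {pdl_db(J):.2f} dB")           # about 0.92 dB for this J
```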
Abstract:
This paper proposes a novel computer vision approach that processes video sequences of people walking and then recognises those people by their gait. Human motion carries different kinds of information that can be analysed in various ways. The skeleton carries motion information about human joints, and the silhouette carries information about the boundary motion of the human body. Moreover, binary and gray-level images contain different information about human movements. This work proposes to recover these different kinds of information to interpret the global motion of the human body, based on four different segmented image models and using a fusion model to improve classification. Our method considers the set of segmented frames of each individual as a distinct class and each frame as an object of that class. The methodology applies background extraction using the Gaussian Mixture Model (GMM), scale reduction based on the Wavelet Transform (WT), and feature extraction by Principal Component Analysis (PCA). We propose four new schemas for capturing motion information: the Silhouette-Gray-Wavelet model (SGW) captures motion based on gray-level variations; the Silhouette-Binary-Wavelet model (SBW) captures motion based on binary information; the Silhouette-Edge-Binary model (SEW) captures motion based on edge information; and the Silhouette-Skeleton-Wavelet model (SSW) captures motion based on skeleton movement. The classification rates obtained separately from these four models are then merged using a newly proposed fusion technique. The results suggest excellent performance in terms of recognising people by their gait.
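A sketch of the feature chain described: per-frame scale reduction (a simple 2x2 mean-pool stands in for one wavelet transform level) followed by PCA over flattened silhouette frames. The random frames are placeholders for segmented silhouettes.

```python
# Scale-reduce each silhouette frame, flatten, then extract PCA features.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
frames = rng.integers(0, 2, size=(50, 64, 64)).astype(float)  # binary silhouettes

def wavelet_like_reduce(f):            # coarse stand-in for one WT level
    return f.reshape(32, 2, 32, 2).mean(axis=(1, 3))

X = np.stack([wavelet_like_reduce(f).ravel() for f in frames])
features = PCA(n_components=10).fit_transform(X)   # gait feature vectors
print(features.shape)                              # (50, 10)
```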