840 results for Object-oriented methods (Computer science)


Relevance:

100.00%

Publisher:

Abstract:

Identifying the correct sense of a word in context is crucial for many tasks in natural language processing (machine translation is an example). State-of-the-art methods for Word Sense Disambiguation (WSD) build models using hand-crafted features that usually capture shallow linguistic information. Complex background knowledge, such as semantic relationships, is typically either not used, or used in a specialised manner, due to the limitations of the feature-based modelling techniques employed. On the other hand, empirical results from the use of Inductive Logic Programming (ILP) systems have repeatedly shown that they can use diverse sources of background knowledge when constructing models. In this paper, we investigate whether this ability of ILP systems could be used to improve the predictive accuracy of models for WSD. Specifically, we examine the use of a general-purpose ILP system as a method to construct a set of features using semantic, syntactic and lexical information. This feature set is then used by a common modelling technique in the field (a support vector machine) to construct a classifier for predicting the sense of a word. In our investigation we examine one-shot and incremental approaches to feature-set construction applied to monolingual and bilingual WSD tasks. The monolingual tasks use 32 verbs and 85 verbs and nouns (in English) from the SENSEVAL-3 and SemEval-2007 benchmarks, while the bilingual WSD task consists of 7 highly ambiguous verbs in translating from English to Portuguese. The results are encouraging: the ILP-assisted models show substantial improvements over those that simply use shallow features. In addition, incremental feature-set construction appears to identify smaller and better sets of features. Taken together, the results suggest that the use of ILP with diverse sources of background knowledge provides a way to make substantial progress in the field of WSD.
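
A minimal sketch of the modelling step described above, assuming scikit-learn and a synthetic, hypothetical feature matrix in place of the ILP-induced features (in the paper these would encode coverage of clauses built from semantic, syntactic and lexical background knowledge):

    # Sketch: combine shallow features with Boolean features proposed by an ILP system,
    # then train an SVM classifier to predict word senses. All data below is synthetic.
    import numpy as np
    from sklearn.svm import SVC

    shallow = np.random.rand(200, 30)                # e.g. bag-of-words / collocation features
    ilp_feats = np.random.randint(0, 2, (200, 15))   # 1 if a hypothetical induced clause covers the example
    senses = np.random.randint(0, 4, 200)            # sense labels for the target word

    X = np.hstack([shallow, ilp_feats])
    clf = SVC(kernel="linear").fit(X, senses)
    print(clf.score(X, senses))                      # training accuracy of the combined model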

Relevance:

100.00%

Publisher:

Abstract:

Process scheduling techniques consider the current load situation to allocate computing resources. Such techniques rely on approximations, such as averages of communication, processing and memory access, to improve process scheduling, although processes may present different behaviors during their execution: they may start with high communication requirements and later perform only processing. By discovering how processes behave over time, we believe it is possible to improve resource allocation. This has motivated this paper, which adopts chaos theory concepts and nonlinear prediction techniques in order to model and predict process behavior. Results confirm that the radial basis function technique presents good predictions at low processing demands, which is essential in a real distributed environment.
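
A rough sketch of the kind of nonlinear predictor discussed above: a small radial basis function model fitted over a delay embedding of a synthetic process-behavior series. The embedding parameters, number of centers and kernel width are arbitrary choices, not the paper's settings:

    # Sketch: one-step-ahead prediction of a process-behavior series (e.g. CPU usage)
    # with a simple radial basis function model over a delay embedding.
    import numpy as np

    series = np.sin(np.linspace(0, 20, 500)) + 0.05 * np.random.randn(500)
    d, tau = 3, 2                                   # embedding dimension and delay (assumed)
    X = np.array([series[i:i + d * tau:tau] for i in range(len(series) - d * tau)])
    y = series[d * tau:]

    centers = X[np.random.choice(len(X), 20, replace=False)]
    sigma = 0.5
    def phi(A):                                     # Gaussian RBF activations
        return np.exp(-np.linalg.norm(A[:, None] - centers[None], axis=2) ** 2 / (2 * sigma ** 2))

    w, *_ = np.linalg.lstsq(phi(X), y, rcond=None)  # fit output weights by least squares
    pred = phi(X[-1:]) @ w                          # prediction for the last embedded point
    print(pred[0], y[-1])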

Relevance:

100.00%

Publisher:

Abstract:

Document engineering is the computer science discipline that investigates systems for documents in any form and in all media. As with the relationship between software engineering and software, document engineering is concerned with principles, tools and processes that improve our ability to create, manage, and maintain documents (http://www.documentengineering.org). The ACM Symposium on Document Engineering is an annual meeting of researchers active in document engineering; it is sponsored by ACM through the ACM SIGWEB Special Interest Group. In this editorial, we first point to work carried out in the context of document engineering that is directly related to multimedia tools and applications. We conclude with a summary of the papers presented in this special issue.

Relevance:

100.00%

Publisher:

Abstract:

Pervasive and ubiquitous computing has motivated research on multimedia adaptation, which aims at matching the video quality to the user needs and device restrictions. This technique has a high computational cost, which needs to be studied and estimated when designing architectures and applications. This paper presents an analytical model to quantify these video transcoding costs in a hardware-independent way. The model was used to analyze the impact of transcoding delays in end-to-end live-video transmissions over LANs, MANs and WANs. Experiments confirm that the proposed model helps to define the best transcoding architecture for different scenarios.
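
As an illustration only, a hypothetical simplified delay estimate of the kind such an analytical model could support; the terms and numbers are invented and do not reproduce the paper's model:

    # Hypothetical end-to-end delay estimate for live video with an intermediate transcoder.
    def end_to_end_delay(frame_rate_hz, transcode_cost_per_frame_s, network_rtt_s, buffer_frames):
        capture = 1.0 / frame_rate_hz                 # time to produce one frame
        transcode = transcode_cost_per_frame_s        # adaptation cost (hardware-independent unit)
        buffering = buffer_frames / frame_rate_hz     # receiver-side buffering
        return capture + transcode + network_rtt_s / 2 + buffering

    # e.g. 30 fps stream, 12 ms per-frame transcoding, 80 ms RTT (WAN), 3-frame buffer
    print(end_to_end_delay(30, 0.012, 0.080, 3))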

Relevance:

100.00%

Publisher:

Abstract:

Modern database applications are increasingly employing database management systems (DBMS) to store multimedia and other complex data. To adequately support the queries required to retrieve these kinds of data, the DBMS needs to answer similarity queries. However, the standard structured query language (SQL) does not provide effective support for such queries. This paper proposes an extension to SQL that seamlessly integrates syntactical constructions to express similarity predicates into the existing SQL syntax, and describes the implementation of a similarity retrieval engine that allows posing similarity queries using the language extension in a relational DBMS. The engine allows the evaluation of every aspect of the proposed extension, including the data definition language and data manipulation language statements, and employs metric access methods to accelerate the queries. Copyright (c) 2008 John Wiley & Sons, Ltd.
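
For intuition, a plain Python sketch of the two similarity predicates such an engine must evaluate (range and k-nearest-neighbour); the paper's contribution is to express these in extended SQL and to accelerate them with metric access methods, whose actual syntax is not reproduced here:

    # Sketch: naive evaluation of similarity predicates over an in-memory table of feature vectors.
    import math

    def dist(a, b):                                   # any metric works (here: Euclidean)
        return math.dist(a, b)

    table = [("img1", (0.1, 0.2)), ("img2", (0.4, 0.4)), ("img3", (0.9, 0.1))]
    query = (0.2, 0.25)

    range_result = [rid for rid, feat in table if dist(feat, query) <= 0.2]   # range predicate
    knn_result = sorted(table, key=lambda r: dist(r[1], query))[:2]           # k-NN predicate
    print(range_result, [rid for rid, _ in knn_result])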

Relevance:

100.00%

Publisher:

Abstract:

This paper tackles the problem of showing that evolutionary algorithms for fuzzy clustering can be more efficient than systematic (i.e. repetitive) approaches when the number of clusters in a data set is unknown. To do so, a fuzzy version of an Evolutionary Algorithm for Clustering (EAC) is introduced. A fuzzy cluster validity criterion and a fuzzy local search algorithm are used instead of their hard counterparts employed by EAC. Theoretical complexity analyses for both the systematic and evolutionary algorithms of interest are provided. Examples with computational experiments and statistical analyses are also presented.
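
As an example of a fuzzy cluster validity criterion of the kind mentioned above, a sketch of the Xie-Beni index (lower is better); the specific criterion adopted by the fuzzy EAC may differ:

    # Sketch: Xie-Beni validity index for a fuzzy partition, usable to compare
    # candidate partitions with different numbers of clusters. Data is synthetic.
    import numpy as np

    def xie_beni(X, centers, U, m=2.0):
        # X: (n, d) data, centers: (c, d), U: (c, n) fuzzy memberships, m: fuzzifier
        d2 = np.linalg.norm(X[None] - centers[:, None], axis=2) ** 2       # (c, n) squared distances
        compactness = np.sum((U ** m) * d2)
        sep = np.min([np.linalg.norm(centers[i] - centers[j]) ** 2
                      for i in range(len(centers)) for j in range(len(centers)) if i != j])
        return compactness / (len(X) * sep)

    X = np.random.rand(100, 2)
    centers = X[:3]
    U = np.random.dirichlet(np.ones(3), size=100).T                         # memberships sum to 1 per point
    print(xie_beni(X, centers, U))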

Relevance:

100.00%

Publisher:

Abstract:

Techniques devoted to generating triangular meshes from intensity images either take as input a segmented image or generate a mesh without distinguishing individual structures contained in the image. These facts may cause difficulties in using such techniques in some applications, such as numerical simulations. In this work we reformulate a previously developed technique for mesh generation from intensity images called Imesh. This reformulation makes Imesh more versatile thanks to a unified framework that allows an easy change of refinement metric, rendering it effective for constructing meshes for applications with varied requirements, such as numerical simulation and image modeling. Furthermore, a deeper study of the point insertion problem and the development of a geometrical criterion for segmentation are also reported in this paper. Meshes with a theoretical guarantee of quality can also be obtained for each individual image structure as a post-processing step, a characteristic not usually found in other methods. The tests demonstrate the flexibility and the effectiveness of the approach.
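
A toy sketch of one refinement metric that could be plugged into such a framework: refine a triangle when the linear interpolation of intensity over its vertices misfits the image at the centroid. This is an illustrative metric, not the criterion used by Imesh:

    # Sketch: intensity-misfit refinement test for a triangle of an image-based mesh.
    import numpy as np

    def needs_refinement(image, tri_vertices, tol=10.0):
        # tri_vertices: three (row, col) vertex positions inside the image
        v = np.array(tri_vertices, dtype=float)
        centroid = v.mean(axis=0)
        interp = np.mean([image[int(r), int(c)] for r, c in v])   # linear interpolant at the centroid
        actual = image[int(centroid[0]), int(centroid[1])]
        return abs(float(interp) - float(actual)) > tol

    img = np.random.randint(0, 256, (64, 64))
    print(needs_refinement(img, [(5, 5), (5, 30), (30, 5)]))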

Relevance:

100.00%

Publisher:

Abstract:

The substitution of missing values, also called imputation, is an important data preparation task for many domains. Ideally, the substitution of missing values should not insert biases into the dataset. This aspect has usually been assessed by measures of the prediction capability of imputation methods. Such measures assume the simulation of missing entries for some attributes whose values are actually known. These artificially missing values are imputed and then compared with the original values. Although this evaluation is useful, it does not allow the influence of imputed values on the ultimate modelling task (e.g. classification) to be inferred. We argue that imputation cannot be properly evaluated apart from the modelling task. Thus, alternative approaches are needed. This article elaborates on the influence of imputed values in classification. In particular, a practical procedure for estimating the inserted bias is described. As an additional contribution, we have used such a procedure to empirically illustrate the performance of three imputation methods (majority, naive Bayes and Bayesian networks) in three datasets. Three classifiers (decision tree, naive Bayes and nearest neighbours) have been used as modelling tools in our experiments. The achieved results illustrate a variety of situations that can take place in data preparation practice.
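
A minimal sketch of the evaluation idea argued for above, using scikit-learn with stand-in choices (most-frequent imputation, a decision tree, the Iris data): artificially remove entries, impute, and compare downstream classification accuracy rather than the reconstructed values themselves:

    # Sketch: judge imputation by its effect on the modelling task, not by value reconstruction.
    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.impute import SimpleImputer
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)
    rng = np.random.default_rng(0)
    X_miss = X.copy()
    X_miss[rng.random(X.shape) < 0.1] = np.nan          # artificially remove 10% of the entries

    X_imp = SimpleImputer(strategy="most_frequent").fit_transform(X_miss)
    base = cross_val_score(DecisionTreeClassifier(random_state=0), X, y).mean()
    imp = cross_val_score(DecisionTreeClassifier(random_state=0), X_imp, y).mean()
    print(f"accuracy on original data: {base:.3f}, after imputation: {imp:.3f}")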

Relevance:

100.00%

Publisher:

Abstract:

This paper aims at identifying some of the key factors in adopting an organization-wide software reuse program. The factors are derived from practical experience reported by industry professionals, through a survey involving 57 Brazilian small, medium and large software organizations. Some of them produce software with commonality between applications and have mature processes, while others successfully achieved reuse through isolated, ad hoc efforts. The paper compiles the answers from the survey participants, showing which factors were more associated with reuse success. Based on this relationship, a guide is presented, pointing out which factors should be more strongly considered by small, medium and large organizations attempting to establish a reuse program. (C) 2007 Elsevier Inc. All rights reserved.

Relevance:

100.00%

Publisher:

Abstract:

The problem of projecting multidimensional data into lower dimensions has been pursued by many researchers due to its potential application to data analyses of various kinds. This paper presents a novel multidimensional projection technique based on least square approximations. The approximations compute the coordinates of a set of projected points based on the coordinates of a reduced number of control points with defined geometry. We name the technique Least Square Projections (LSP). From an initial projection of the control points, LSP defines the positioning of their neighboring points through a numerical solution that aims at preserving a similarity relationship between the points given by a metric in mD. In order to perform the projection, a small number of distance calculations are necessary, and no repositioning of the points is required to obtain a final solution with satisfactory precision. The results show the capability of the technique to form groups of points by degree of similarity in 2D. We illustrate that capability through its application to mapping collections of textual documents from varied sources, a strategic yet difficult application. LSP is faster and more accurate than other existing high-quality methods, particularly where it was mostly tested, that is, for mapping text sets.
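
A compact sketch of the least-squares placement idea, with assumed values for the neighborhood size and the initial layout of the control points; LSP's actual neighborhood construction and control-point projection are described in the paper:

    # Sketch: pin a few control points in 2-D, then place every other point as the average of
    # its high-dimensional neighbours by solving a least-squares system.
    import numpy as np

    X = np.random.rand(100, 10)                              # high-dimensional data
    ctrl = np.random.choice(100, 10, replace=False)          # indices of control points
    ctrl_pos = np.random.rand(10, 2)                         # assumed initial 2-D layout of controls

    k = 5
    D = np.linalg.norm(X[:, None] - X[None], axis=2)
    neigh = np.argsort(D, axis=1)[:, 1:k + 1]                # k nearest neighbours per point

    n = len(X)
    A = np.zeros((n + len(ctrl), n))
    b = np.zeros((n + len(ctrl), 2))
    for i in range(n):                                       # Laplacian-like rows: p_i - mean(neighbours) = 0
        A[i, i] = 1.0
        A[i, neigh[i]] = -1.0 / k
    for j, c in enumerate(ctrl):                             # control rows pin the control points
        A[n + j, c] = 1.0
        b[n + j] = ctrl_pos[j]

    P, *_ = np.linalg.lstsq(A, b, rcond=None)                # 2-D coordinates for all points
    print(P.shape)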

Relevance:

100.00%

Publisher:

Abstract:

The literature reports research efforts allowing the editing of interactive TV multimedia documents by end-users. In this article we propose complementary contributions relative to end-user generated interactive video, video tagging, and collaboration. In earlier work we proposed the watch-and-comment (WaC) paradigm as the seamless capture of an individual's comments so that corresponding annotated interactive videos are automatically generated. As a proof of concept, we implemented a prototype application, the WACTOOL, that supports the capture of digital ink and voice comments over individual frames and segments of the video, producing a declarative document that specifies both the structure of the different media streams and their synchronization. In this article, we extend the WaC paradigm in two ways. First, user-video interactions are associated with edit commands and digital ink operations. Second, focusing on collaboration and distribution issues, we employ annotations as simple containers for context information, using them as tags in order to organize, store and distribute information in a P2P-based multimedia capture platform. We highlight the design principles of the watch-and-comment paradigm and demonstrate related results, including the current version of the WACTOOL and its architecture. We also illustrate how an interactive video produced by the WACTOOL can be rendered in an interactive video environment, the Ginga-NCL player, and include results from a preliminary evaluation.

Relevance:

100.00%

Publisher:

Abstract:

In this paper we describe and evaluate a geometric mass-preserving redistancing procedure for the level set function on general structured grids. The proposed algorithm is adapted from a recent finite element-based method and preserves the mass by means of a localized mass correction. A salient feature of the scheme is the absence of adjustable parameters. The algorithm is tested in two and three spatial dimensions and compared with the widely used partial differential equation (PDE)-based redistancing method using structured Cartesian grids. Through the use of quantitative error measures of interest in level set methods, we show that the overall performance of the proposed geometric procedure is better than that of PDE-based reinitialization schemes, since it is more robust with comparable accuracy. We also show that the algorithm is well suited for the highly stretched curvilinear grids used in CFD simulations. Copyright (C) 2010 John Wiley & Sons, Ltd.
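
A simplified illustration of the mass-preservation idea: measure the enclosed area with a smoothed Heaviside before and after a (here, artificially perturbed) reinitialization and shift the level set to restore it. The paper applies a localized correction on general structured grids; the global bisection below is only a stand-in:

    # Sketch: restore the area enclosed by the zero level set after a mass-losing update.
    import numpy as np

    def mass(phi, h, eps):
        H = 0.5 * (1 + np.tanh(phi / eps))            # smoothed Heaviside
        return np.sum(1 - H) * h * h                  # area where phi < 0

    x, h = np.linspace(-1, 1, 101), 0.02
    X, Y = np.meshgrid(x, x)
    phi = np.sqrt(X**2 + Y**2) - 0.5                  # signed distance to a circle of radius 0.5
    target = mass(phi, h, 1.5 * h)

    phi_reinit = phi + 0.01 * np.random.randn(*phi.shape)   # stand-in for a mass-losing reinitialization
    lo, hi = -0.1, 0.1
    for _ in range(50):                               # bisection on a constant shift c so mass(phi + c) = target
        c = 0.5 * (lo + hi)
        lo, hi = (c, hi) if mass(phi_reinit + c, h, 1.5 * h) > target else (lo, c)
    print(target, mass(phi_reinit + 0.5 * (lo + hi), h, 1.5 * h))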

Relevance:

100.00%

Publisher:

Abstract:

This work presents a numerical method suitable for the study of the development of internal boundary layers (IBL) and their characteristics for flows over various types of coastal cliffs. The IBL is an important meteorological phenomenon for flows with surface roughness and topographical step changes. A two-dimensional flow program was used for this study. The governing equations were written using the vorticity-velocity formulation. The spatial derivatives were discretized by high-order compact finite difference schemes. The time integration was performed with a low-storage fourth-order Runge-Kutta scheme. The coastal cliff (step) was specified through an immersed boundary method. The validation of the code was done by comparison of the results with experimental and observational data. The numerical simulations were carried out for different coastal cliff heights and inclinations. The results show that the predominant factors for the height of the IBL and its characteristics are the upstream velocity, and the height and form (inclination) of the coastal cliff. Copyright (C) 2010 John Wiley & Sons, Ltd.
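
As an aside on the spatial discretization mentioned above, a sketch of a classical fourth-order compact (Padé) approximation of the first derivative on a periodic grid; the boundary treatment and the low-storage Runge-Kutta integration used in the paper are omitted:

    # Sketch: fourth-order compact finite difference for f'(x) on a periodic grid.
    import numpy as np

    n = 64
    x = np.linspace(0, 2 * np.pi, n, endpoint=False)
    h = x[1] - x[0]
    f = np.sin(x)

    # f'_{i-1} + 4 f'_i + f'_{i+1} = 3 (f_{i+1} - f_{i-1}) / h   (periodic indices)
    A = 4 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    A[0, -1] = A[-1, 0] = 1
    rhs = 3 * (np.roll(f, -1) - np.roll(f, 1)) / h
    df = np.linalg.solve(A, rhs)

    print(np.max(np.abs(df - np.cos(x))))             # error versus the exact derivative cos(x)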

Relevance:

100.00%

Publisher:

Abstract:

The purpose of this paper is to develop a Bayesian analysis for nonlinear regression models under scale mixtures of skew-normal distributions. This novel class of models provides a useful generalization of symmetrical nonlinear regression models, since the error distributions cover both skewness and heavy tails, as in the skew-t, skew-slash and skew-contaminated normal distributions. The main advantage of this class of distributions is that it has a convenient hierarchical representation that allows the implementation of Markov chain Monte Carlo (MCMC) methods to simulate samples from the joint posterior distribution. In order to examine the robustness of this flexible class against outlying and influential observations, we present Bayesian case-deletion influence diagnostics based on the Kullback-Leibler divergence. Further, some discussion on model selection criteria is given. The newly developed procedures are illustrated with two simulation studies and a real data set previously analyzed under normal and skew-normal nonlinear regression models. (C) 2010 Elsevier B.V. All rights reserved.
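
A small sketch of the hierarchical (stochastic) representation that makes MCMC convenient for this class, shown for skew-t errors: a skew-normal draw divided by the square root of a Gamma mixing variable. Parameter values are arbitrary:

    # Sketch: sample skew-t errors via the scale-mixture-of-skew-normal representation.
    import numpy as np

    rng = np.random.default_rng(0)

    def skew_t(mu, sigma, lam, nu, size):
        delta = lam / np.sqrt(1 + lam ** 2)
        z0 = np.abs(rng.standard_normal(size))            # half-normal component
        z1 = rng.standard_normal(size)
        z = delta * z0 + np.sqrt(1 - delta ** 2) * z1     # skew-normal(0, 1, lam)
        u = rng.gamma(nu / 2, 2 / nu, size)               # mixing variable, E[u] = 1
        return mu + sigma * z / np.sqrt(u)                # scale mixture gives skew-t errors

    e = skew_t(mu=0.0, sigma=1.0, lam=3.0, nu=4.0, size=10000)
    print(e.mean(), e.std())                               # positive mean reflects the skewness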

Relevance:

100.00%

Publisher:

Abstract:

Most multidimensional projection techniques rely on distance (dissimilarity) information between data instances to embed high-dimensional data into a visual space. When data are endowed with Cartesian coordinates, an extra computational effort is necessary to compute the needed distances, making multidimensional projection prohibitive in applications dealing with interactivity and massive data. The novel multidimensional projection technique proposed in this work, called Part-Linear Multidimensional Projection (PLMP), has been tailored to handle multivariate data represented in Cartesian high-dimensional spaces, requiring only distance information between pairs of representative samples. This characteristic renders PLMP faster than previous methods when processing large data sets while still being competitive in terms of precision. Moreover, knowing the range of variation for data instances in the high-dimensional space, we can make PLMP a truly streaming data projection technique, a trait absent in previous methods.
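
A minimal sketch of the part-linear idea under simplifying assumptions (random data, a random 2-D layout for the representative samples): fit one linear map by least squares on the samples and apply it to every other instance, which is what makes streaming projection possible:

    # Sketch: least-squares linear map from the Cartesian high-dimensional space to 2-D.
    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.random((5000, 50))                              # high-dimensional data
    samples = rng.choice(5000, 60, replace=False)           # representative samples
    Y_samples = rng.random((60, 2))                         # their 2-D positions (e.g. from a slower method)

    Phi, *_ = np.linalg.lstsq(X[samples], Y_samples, rcond=None)   # linear map (50 x 2)
    Y = X @ Phi                                             # project everything (also works one row at a time)
    print(Y.shape)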