36 results for COMPUTER SCIENCE, INFORMATION SYSTEMS
in the Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
Policy hierarchies and automated policy refinement are powerful approaches to simplify administration of security services in complex network environments. A crucial issue for the practical use of these approaches is to ensure the validity of the policy hierarchy, i.e. since the policy sets for the lower levels are automatically derived from the abstract policies (defined by the modeller), we must be sure that the derived policies uphold the high-level ones. This paper builds upon previous work on Model-based Management, particularly on the Diagram of Abstract Subsystems approach, and goes further to propose a formal validation approach for the policy hierarchies yielded by the automated policy refinement process. We establish general validation conditions for a multi-layered policy model, i.e. necessary and sufficient conditions that a policy hierarchy must satisfy so that the lower-level policy sets are valid refinements of the higher-level policies according to the criteria of consistency and completeness. Relying upon the validation conditions and upon axioms about the model representativeness, two theorems are proved to ensure compliance between the resulting system behaviour and the abstract policies that are modelled.
Abstract:
Developed countries have an even distribution of published papers on the seventeen model organisms. Developing countries have biased preferences for a few model organisms which are associated with endemic human diseases. A variant of the Hirsch index, which we call the mean (mo)h-index ("model organism h-index"), shows an exponential relationship with the number of papers published in each country on the selected model organisms. Developing countries cluster together with low mean (mo)h-indexes, even those with a high number of publications. The growth curves of publications on the recent model Caenorhabditis elegans in developed countries show different shapes. We also analyzed the growth curves of indexed publications originating from developing countries. Brazil and South Korea were selected for this comparison. The most prevalent model organisms in those countries show different growth curves when compared to a global analysis, reflecting the size and composition of their research communities.
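As a rough illustration of the index mentioned above, the sketch below computes the standard Hirsch index for a list of citation counts and then averages it over model organisms. The averaging step is an assumption about what "mean (mo)h-index" denotes; the abstract does not give the exact definition. The function names and the sample citation counts are hypothetical.

```python
from typing import Dict, List


def h_index(citations: List[int]) -> int:
    """Standard Hirsch index: the largest h such that h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def mean_mo_h_index(citations_by_organism: Dict[str, List[int]]) -> float:
    """ASSUMPTION: the country-level value is the mean of per-organism h-indexes."""
    if not citations_by_organism:
        raise ValueError("at least one model organism is required")
    values = [h_index(c) for c in citations_by_organism.values()]
    return sum(values) / len(values)


# Hypothetical citation counts for one country's papers on two model organisms.
country = {
    "organism_a": [10, 8, 5, 4, 3],
    "organism_b": [25, 8, 5, 3, 3],
}
print(mean_mo_h_index(country))
```

Because the h-index depends on how citations are distributed across papers, a country can publish many papers on an organism and still score low, which is consistent with the clustering the abstract reports.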
Abstract:
Predictive performance evaluation is a fundamental issue in the design, development, and deployment of classification systems. As predictive performance evaluation is a multidimensional problem, single scalar summaries such as error rate, although quite convenient due to their simplicity, can seldom evaluate all the aspects that a complete and reliable evaluation must consider. For this reason, various graphical performance evaluation methods are increasingly drawing the attention of the machine learning, data mining, and pattern recognition communities. The main advantage of these methods resides in their ability to depict the trade-offs between evaluation aspects in a multidimensional space rather than reducing these aspects to an arbitrarily chosen (and often biased) single scalar measure. Furthermore, to appropriately select a suitable graphical method for a given task, it is crucial to identify its strengths and weaknesses. This paper surveys various graphical methods often used for predictive performance evaluation. By presenting these methods in the same framework, we hope this paper may shed some light on deciding which methods are more suitable to use in different situations.
Abstract:
Searching in a dataset for elements that are similar to a given query element is a core problem in applications that manage complex data, and has been aided by metric access methods (MAMs). A growing number of applications require indices that must be built faster and repeatedly, while also providing faster responses to similarity queries. The increase in main memory capacity and its falling cost also motivate the use of memory-based MAMs. In this paper, we propose the Onion-tree, a new and robust dynamic memory-based MAM that slices the metric space into disjoint subspaces to provide quick indexing of complex data. It introduces three major characteristics: (i) a partitioning method that controls the number of disjoint subspaces generated at each node; (ii) a replacement technique that can change the leaf node pivots in insertion operations; and (iii) range and k-NN extended query algorithms to support the new partitioning method, including a new visit order of the subspaces in k-NN queries. Performance tests with both real-world and synthetic datasets showed that the Onion-tree is very compact. Comparisons of the Onion-tree with the MM-tree and a memory-based version of the Slim-tree showed that the Onion-tree was always faster to build the index. The experiments also showed that the Onion-tree significantly improved range and k-NN query processing performance and was the most efficient MAM, followed by the MM-tree, which in turn outperformed the Slim-tree in almost all the tests. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
In this paper, we propose a content selection framework that improves the users' experience when they are enriching or authoring pieces of news. This framework combines a variety of techniques to retrieve semantically related videos, based on a set of criteria which are specified automatically depending on the media's constraints. The combination of different content selection mechanisms can improve the quality of the retrieved scenes, because each technique's limitations are minimized by other techniques' strengths. We present an evaluation based on a number of experiments, which show that the retrieved results are better when all criteria are used together.
Abstract:
In Information Visualization, adding and removing data elements can strongly impact the underlying visual space. We have developed an inherently incremental technique (incBoard) that maintains a coherent disposition of elements from a dynamic multidimensional data set on a 2D grid as the set changes. Here, we introduce a novel layout (incSpace) that uses pairwise similarity from grid neighbors, as defined in incBoard, to reposition elements on the visual space, free from constraints imposed by the grid. The board continues to be updated and can be displayed alongside the new space. As similar items are placed together, while dissimilar neighbors are moved apart, it supports users in the identification of clusters and subsets of related elements. Densely populated areas identified in the incSpace can be efficiently explored with the corresponding incBoard visualization, which is not susceptible to occlusion. The solution remains inherently incremental and maintains a coherent disposition of elements, even for fully renewed sets. The algorithm considers relative positions for the initial placement of elements, and raw dissimilarity to fine-tune the visualization. It has low computational cost, with complexity depending only on the size of the currently viewed subset, V. Thus, a data set of size N can be sequentially displayed in O(N) time, reaching O(N^2) only if the complete set is simultaneously displayed.
Abstract:
While watching TV, viewers use the remote control to turn the TV set on and off, change the channel and volume, adjust the image and audio settings, etc. Worldwide, research institutes collect information about audience measurement, which can also be used to provide personalization and recommendation services, among others. Interactive digital TV offers viewers the opportunity to interact with interactive applications associated with the broadcast program. Interactive TV infrastructure supports the capture of the user-TV interaction at fine-grained levels. In this paper we propose the capture of all the user interaction with a TV remote control, including short-term and instant interactions. We argue that the corresponding captured information can be used to create content pervasively and automatically, and that this content can be used by a wide variety of services, such as audience measurement, personalization and recommendation services. The capture of fine-grained data about instant and interval-based interactions also allows the underlying infrastructure to offer services at the same scale, such as annotation services and adaptive applications. We present the main modules of an infrastructure for TV-based services, along with a detailed example of a document used to record the user-remote control interaction. Our approach is evaluated by means of a proof-of-concept prototype which uses the Brazilian Digital TV System, the Ginga-NCL middleware.
Abstract:
This paper presents a new technique and two algorithms to bulk-load data into multi-way dynamic metric access methods, based on the covering radius of representative elements employed to organize data in hierarchical data structures. The proposed algorithms are sample-based, and they always build a valid and height-balanced tree. We compare the proposed algorithms with existing ones, showing their behavior when bulk-loading data into the Slim-tree metric access method. After identifying the worst case of our first algorithm, we describe adequate countermeasures that give rise to the second algorithm. Experiments performed to evaluate their performance show that our bulk-loading methods build trees faster than the sequential insertion method, and that they also significantly improve search performance. (C) 2009 Elsevier B.V. All rights reserved.
Abstract:
This paper proposes a filter-based algorithm for feature selection. The filter is based on the partitioning of the set of features into clusters. The number of clusters, and consequently the cardinality of the subset of selected features, is automatically estimated from data. The computational complexity of the proposed algorithm is also investigated. A variant of this filter that considers feature-class correlations is also proposed for classification problems. Empirical results involving ten datasets illustrate the performance of the developed algorithm, which in general has obtained competitive results in terms of classification accuracy when compared to state-of-the-art algorithms that find clusters of features. We show that, if computational efficiency is an important issue, then the proposed filter may be preferred over its counterparts, thus becoming eligible to join a pool of feature selection algorithms to be used in practice. As an additional contribution of this work, a theoretical framework is used to formally analyze some properties of feature selection methods that rely on finding clusters of features. (C) 2011 Elsevier Inc. All rights reserved.
Abstract:
This paper is about the use of natural language to communicate with computers. Most research pursuing this goal considers only requests expressed in English. One way to facilitate the use of several languages in natural language systems is to use an interlingua. An interlingua is an intermediary representation for natural language information that can be processed by machines. We propose to convert natural language requests into an interlingua [universal networking language (UNL)] and to execute these requests using software components. In order to achieve this goal, we propose OntoMap, an ontology-based architecture to perform the semantic mapping between UNL sentences and software components. OntoMap also performs component search and retrieval based on semantic information formalized in ontologies and rules.
Abstract:
This paper presents an approach for assisting low-literacy readers in accessing online Web information. The Educational FACILITA tool is a Web content adaptation tool that provides innovative features and follows more intuitive interaction models regarding accessibility concerns. In particular, we propose an interaction model and a Web application that explore the natural language processing tasks of lexical elaboration and named entity labeling for improving Web accessibility. We report on the results obtained from a pilot study on usability analysis carried out with low-literacy users. The preliminary results show that Educational FACILITA improves the comprehension of text elements, although the assistance mechanisms might also confuse users when word sense ambiguity is introduced by gathering, for a complex word, a list of synonyms with multiple meanings. This finding points to a future solution in which the correct sense of a complex word in a sentence is identified, resolving this pervasive characteristic of natural languages. The pilot study also identified that experienced computer users find the tool to be more useful than novice computer users do.
Abstract:
An important feature of a database management system (DBMS) is its client/server architecture, where managing shared memory among the clients and the server is always a tough issue. Similarity queries are especially sensitive to this kind of architecture, since the answer sizes vary widely. Usually, the answer to a similarity query is fully processed and sent in full to the user, who is often interested in just part of it, e.g. just the few elements closest to or farthest from the query reference. Compelling the DBMS to retrieve the full answer and then ignoring most of it is, at the very least, a waste of server processing power. Paging is a technique that splits the answer into several pages, delivered following client requests. Despite the success of paging in traditional queries, little work has been done to support it in similarity queries. In this work, we present a technique that not only provides paging in similarity range or k-nearest neighbor queries, but also supports them in two variations: the forward similarity query and the backward similarity query. They return elements either increasingly farther from or increasingly closer to the query reference. The reported experiments show that, depending on the proportion of the interesting part over the full answer, both techniques answer queries much faster than the non-paged approach. (C) 2010 Elsevier Inc. All rights reserved.
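To make the forward and backward variations concrete, here is a minimal sketch of paged similarity answers. It is not the technique of the paper: it uses a naive linear scan plus a heap rather than a metric index, and the function name `paged_similarity`, its parameters, and the sample data are all hypothetical. It only illustrates the interface, in which pages are produced on demand and ordered either outward from or inward toward the query reference.

```python
import heapq
from typing import Callable, Iterator, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def paged_similarity(
    data: Sequence[T],
    query: T,
    dist: Callable[[T, T], float],
    page_size: int,
    radius: Optional[float] = None,
    backward: bool = False,
) -> Iterator[List[T]]:
    """Yield the answer in pages of at most `page_size` elements.

    Forward order: increasingly farther from the query.
    Backward order: increasingly closer to the query.
    If `radius` is given, only elements within that distance qualify (range query).
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    sign = -1.0 if backward else 1.0
    heap = []
    for position, item in enumerate(data):
        d = dist(query, item)
        if radius is None or d <= radius:
            # The position breaks ties, so the items themselves are never compared.
            heap.append((sign * d, position, item))
    heapq.heapify(heap)
    # Elements are ordered lazily: each page costs only page_size heap pops.
    while heap:
        count = min(page_size, len(heap))
        yield [heapq.heappop(heap)[2] for _ in range(count)]


# Hypothetical one-dimensional data with absolute difference as the metric.
points = [1, 4, 6, 9, 12]
pages = paged_similarity(points, 5, lambda a, b: abs(a - b), page_size=2)
print(next(pages))  # first page only; later pages are computed when requested
```

Because the generator is lazy, a client that needs only the first page never pays for ordering the rest of the answer. A k-NN style request can be served by consuming forward pages until k elements have been received.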
Abstract:
Document engineering is the computer science discipline that investigates systems for documents in any form and in all media. As with the relationship between software engineering and software, document engineering is concerned with principles, tools and processes that improve our ability to create, manage, and maintain documents (http://www.documentengineering.org). The ACM Symposium on Document Engineering is an annual meeting of researchers active in document engineering; it is sponsored by ACM by means of the ACM SIGWEB Special Interest Group. In this editorial, we first point to work carried out in the context of document engineering that is directly related to multimedia tools and applications. We conclude with a summary of the papers presented in this special issue.
Abstract:
Pervasive and ubiquitous computing has motivated research on multimedia adaptation, which aims at matching the video quality to the user needs and device restrictions. This technique has a high computational cost, which needs to be studied and estimated when designing architectures and applications. This paper presents an analytical model to quantify these video transcoding costs in a hardware-independent way. The model was used to analyze the impact of transcoding delays in end-to-end live-video transmissions over LANs, MANs and WANs. Experiments confirm that the proposed model helps to define the best transcoding architecture for different scenarios.
Abstract:
A conceptual problem that appears in different contexts of clustering analysis is that of measuring the degree of compatibility between two sequences of numbers. This problem is usually addressed by means of numerical indexes referred to as sequence correlation indexes. This paper elaborates on why some specific sequence correlation indexes may not be good choices depending on the application scenario at hand. A variant of the Product-Moment correlation coefficient and a weighted formulation for the Goodman-Kruskal and Kendall's indexes are derived that may be more appropriate for some particular application scenarios. The proposed and existing indexes are analyzed from different perspectives, such as their sensitivity to the ranks and magnitudes of the sequences under evaluation, among other relevant aspects of the problem. The results help suggest scenarios within the context of clustering analysis that are possibly more appropriate for the application of each index. (C) 2008 Elsevier Inc. All rights reserved.
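For orientation, the sketch below implements the classical, unweighted forms of two of the indexes named above: the Goodman-Kruskal gamma and Kendall's tau-a, both built from counts of concordant and discordant pairs. The weighted formulations and the Product-Moment variant derived in the paper are not reproduced here, since the abstract does not define them. Function names are illustrative.

```python
from itertools import combinations
from typing import Sequence, Tuple


def concordance_counts(x: Sequence[float], y: Sequence[float]) -> Tuple[int, int]:
    """Count concordant and discordant pairs; pairs tied in x or y count as neither."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return concordant, discordant


def goodman_kruskal_gamma(x: Sequence[float], y: Sequence[float]) -> float:
    """gamma = (C - D) / (C + D); tied pairs are ignored entirely."""
    c, d = concordance_counts(x, y)
    if c + d == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (c - d) / (c + d)


def kendall_tau_a(x: Sequence[float], y: Sequence[float]) -> float:
    """tau-a = (C - D) / (n(n-1)/2); tied pairs remain in the denominator."""
    n = len(x)
    if n < 2:
        raise ValueError("at least two observations are required")
    c, d = concordance_counts(x, y)
    return (c - d) / (n * (n - 1) / 2)


# The two indexes agree without ties and diverge when ties are present.
print(goodman_kruskal_gamma([1, 2, 3], [1, 1, 2]), kendall_tau_a([1, 2, 3], [1, 1, 2]))
```

Both indexes depend only on the ordering of the values, not on their magnitudes, which is one of the sensitivity aspects the abstract says the paper examines.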