Digital Library

996 results for trä

Sampling and learning the Mallows model under the Ulam distance

Abstract:

[EN] In this paper we deal with probability distributions over permutation spaces. The probability model in use is the Mallows model, and the permutation distance that the model uses is the Ulam distance.
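
As a rough illustration of the quantities named in this abstract, the sketch below computes the Ulam distance between two permutations (n minus the length of their longest common subsequence) and the unnormalized Mallows weight exp(-theta * d). The function names and the parameter `theta` are assumptions chosen for this example; they are not taken from the paper.

```python
import math
from bisect import bisect_left


def longest_increasing_subsequence_length(seq):
    """Length of the longest strictly increasing subsequence (patience sorting)."""
    tails = []  # tails[k] = smallest tail of an increasing subsequence of length k + 1
    for x in seq:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)


def ulam_distance(sigma, pi):
    """Ulam distance: n minus the longest common subsequence of sigma and pi."""
    # Relabel sigma by each item's position in pi, so that pi becomes the identity.
    position_in_pi = {item: idx for idx, item in enumerate(pi)}
    relabeled = [position_in_pi[item] for item in sigma]
    return len(sigma) - longest_increasing_subsequence_length(relabeled)


def mallows_weight(sigma, center, theta):
    """Unnormalized Mallows weight exp(-theta * d(sigma, center)); theta is hypothetical."""
    return math.exp(-theta * ulam_distance(sigma, center))


if __name__ == "__main__":
    # Moving item 1 from the back to the front takes a single step.
    print(ulam_distance([2, 3, 4, 1], [1, 2, 3, 4]))
```

Dividing each weight by the sum over all permutations would give the model's probabilities; that normalization is left out here.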


An R package for permutations, Mallows and Generalized Mallows models

Abstract:

[EN] Probability models on permutations associate a probability value to each of the permutations of n items. This paper considers two popular probability models, the Mallows model and the Generalized Mallows model. We describe methods for inference, sampling, and learning with such distributions, some of which are novel in the literature. This paper also describes operations on permutations, with special attention to those related to the Kendall and Cayley distances and to the random generation of permutations. These operations are of key importance for the efficient computation of the operations on distributions. These algorithms are implemented in the associated R package, whose internal code is written in C++.
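
To make the two distances mentioned above concrete, here is a minimal Python sketch (not the R package's API, which this listing does not show). It computes the Kendall distance as the number of discordant pairs, the Cayley distance as n minus the number of cycles, and brute-force Mallows probabilities for a small n. All function names and the dispersion parameter `theta` are assumptions made for the example.

```python
import math
from itertools import permutations


def _relabel(sigma, pi):
    """Express sigma in coordinates where pi is the identity permutation."""
    position_in_pi = {item: idx for idx, item in enumerate(pi)}
    return [position_in_pi[item] for item in sigma]


def kendall_distance(sigma, pi):
    """Number of item pairs that sigma and pi place in opposite order."""
    r = _relabel(sigma, pi)
    n = len(r)
    return sum(1 for i in range(n) for j in range(i + 1, n) if r[i] > r[j])


def cayley_distance(sigma, pi):
    """Minimum number of swaps turning sigma into pi: n minus the cycle count."""
    r = _relabel(sigma, pi)
    seen = [False] * len(r)
    cycles = 0
    for start in range(len(r)):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:
                seen[j] = True
                j = r[j]
    return len(r) - cycles


def mallows_probabilities(center, theta, distance):
    """Brute-force Mallows distribution; only feasible for very small n."""
    perms = list(permutations(center))
    weights = [math.exp(-theta * distance(p, center)) for p in perms]
    z = sum(weights)  # normalization constant
    return {p: w / z for p, w in zip(perms, weights)}


if __name__ == "__main__":
    probs = mallows_probabilities((1, 2, 3), theta=1.0, distance=kendall_distance)
    for perm, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        print(perm, round(p, 4))
```

Enumerating all n! permutations is only for illustration; efficient implementations rely on closed-form normalization constants and specialized samplers.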


Analysis of Spanish text-thesaurus as a complex network

Abstract:

[EN] Based on the theoretical tools of Complex Networks, this work provides a basic descriptive study of a synonyms dictionary, the Spanish Open Thesaurus, represented as a graph. We study the main structural measures of the network and compare them with those of a random graph. Numerical results show that Open-Thesaurus is a graph whose topological properties approximate those of a scale-free network, but it does not seem to exhibit the small-world property because of its sparse structure. We also found that the words of highest betweenness centrality are terms that suggest the vocabulary of psychoanalysis: placer (pleasure), ayudante (in the sense of assistant or worker), and regular (to regulate).


Simulations of FGTS returns (Simulações sobre a rentabilidade do FGTS)

Abstract:

The simulations indicate that profit distribution could provide, as long as adequate parameters are established and the returns of the FGTS investment portfolio are preserved, a significant increase in the current remuneration of workers' linked accounts, which presently consists of TR + 3% per year.


Metrics of monograph use in the Marston Science Library

Abstract:

As academic libraries are increasingly supported by a matrix of database functions, the use of data mining and visualization techniques offers significant potential for future collection development and service initiatives based on quantifiable data. While data collection techniques are still not standardized and results may be skewed because of granularity problems, faulty algorithms, and a host of other factors, useful baseline data is extractable and broad trends can be identified. The purpose of the current study is to provide an initial assessment of data associated with the science monograph collection at the Marston Science Library (MSL), University of Florida. These sciences fall within the major Library of Congress Classification schedules of Q, S, and T, excluding R, TN, TR, and TT. The overall strategy of this project is to look at the potential science audiences within the university community and analyze data related to purchasing and circulation patterns, e-book usage, and interlibrary loan statistics. While a longitudinal study from 2004 to the present would be ideal, this paper presents the results from the academic year July 1, 2008 to June 30, 2009, which was chosen as the pilot period because all data reservoirs identified above were available.


Efficient learning of decomposable models with a bounded clique size

Abstract:

The learning of probability distributions from data is a ubiquitous problem in the fields of Statistics and Artificial Intelligence. Over the last decades, several learning algorithms have been proposed to learn probability distributions based on decomposable models due to their advantageous theoretical properties. Some of these algorithms can be used to search for a maximum likelihood decomposable model with a given maximum clique size, k, which controls the complexity of the model. Unfortunately, the problem of learning a maximum likelihood decomposable model given a maximum clique size is NP-hard for k > 2. In this work, we propose a family of algorithms which approximates this problem with a computational complexity of O(k · n^2 log n) in the worst case, where n is the number of random variables involved. The structures of the decomposable models that solve the maximum likelihood problem are called maximal k-order decomposable graphs. Our proposals, called fractal trees, construct a sequence of maximal i-order decomposable graphs, for i = 2, ..., k, in k − 1 steps. At each step, the algorithms follow a divide-and-conquer strategy based on the particular features of this type of structure. Additionally, we propose a prune-and-graft procedure which transforms a maximal k-order decomposable graph into another one, increasing its likelihood. We have implemented two particular fractal tree algorithms called parallel fractal tree and sequential fractal tree. These algorithms can be considered a natural extension of Chow and Liu’s algorithm, from k = 2 to arbitrary values of k. Both algorithms have been compared against other efficient approaches in artificial and real domains, and they have shown competitive behavior in dealing with the maximum likelihood problem. Due to their low computational complexity, they are especially recommended for high-dimensional domains.
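
For orientation, the k = 2 base case that the abstract refers to, Chow and Liu's algorithm, can be sketched as a maximum-weight spanning tree over pairwise empirical mutual information. The code below is only that classic base case under simple assumptions (discrete data given as a list of tuples); it is not the fractal tree algorithms, and the function names are chosen for this example.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(data, i, j):
    """Empirical mutual information (in nats) between columns i and j."""
    n = len(data)
    count_i = Counter(row[i] for row in data)
    count_j = Counter(row[j] for row in data)
    count_ij = Counter((row[i], row[j]) for row in data)
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_i[a] * count_j[b]))
    return mi


def chow_liu_tree(data):
    """Return the edges of a maximum mutual-information spanning tree (Kruskal)."""
    n_vars = len(data[0])
    edges = sorted(
        ((mutual_information(data, i, j), i, j)
         for i, j in combinations(range(n_vars), 2)),
        reverse=True,
    )
    parent = list(range(n_vars))  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for _, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # adding this edge does not create a cycle
            parent[root_i] = root_j
            tree.append((i, j))
    return tree


if __name__ == "__main__":
    # Column 1 copies column 0; column 2 is independent of both.
    data = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
    print(chow_liu_tree(data))
```

Computing all pairwise mutual informations costs O(n^2) pair evaluations, which is where the n^2 factor in tree-based structure learning typically comes from.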


Extending Distance-based Ranking Models In Estimation of Distribution Algorithms

Abstract:

Recently, probability models on rankings have been proposed in the field of estimation of distribution algorithms (EDAs) in order to solve permutation-based combinatorial optimisation problems. In particular, distance-based ranking models, such as Mallows and Generalized Mallows under the Kendall's-τ distance, have demonstrated their validity when solving this type of problem. Nevertheless, there are still many trends that deserve further study. In this paper, we extend the use of distance-based ranking models in the framework of EDAs by introducing new distance metrics such as Cayley and Ulam. In order to analyse the performance of the Mallows and Generalized Mallows EDAs under the Kendall, Cayley and Ulam distances, we run them on a benchmark of 120 instances from four well-known permutation problems. The experiments showed that no single metric performs best on all the problems. However, the statistical test indicated that the Mallows-Ulam EDA is the most stable algorithm among the studied proposals.
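
An EDA of this kind repeatedly samples new candidate permutations from the learned model. As a hedged sketch of that sampling step only, the code below draws from a Mallows model under the Kendall distance using the standard repeated insertion procedure; it is not the authors' EDA, and the function name, `theta`, and the `rng` argument are assumptions made for this example.

```python
import math
import random


def sample_mallows_kendall(center, theta, rng):
    """Draw one permutation from a Mallows model centered at `center`.

    Items are inserted one at a time in the order given by `center`.
    Inserting the i-th item (0-based) at index j puts it ahead of i - j
    items that precede it in `center`, adding i - j to the Kendall
    distance, so index j is chosen with weight exp(-theta) ** (i - j).
    """
    phi = math.exp(-theta)
    result = []
    for i, item in enumerate(center):
        weights = [phi ** (i - j) for j in range(i + 1)]
        j = rng.choices(range(i + 1), weights=weights)[0]
        result.insert(j, item)
    return result


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    center = [1, 2, 3, 4, 5]
    for theta in (0.0, 1.0, 5.0):
        print(theta, sample_mallows_kendall(center, theta, rng))
```

With theta = 0 every permutation is equally likely; as theta grows, samples concentrate around the central permutation, which is what lets an EDA shift from exploration to exploitation.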


Guidelines for tagging Basque temporal structures v1.0 (Euskarazko denbora-egiturak etiketatzeko gidalerroak v1.0)

Abstract:

To interpret the temporal information in texts, a mark-up language that encodes that information is needed, so that the information can be accessed automatically. The most widely used mark-up language is TimeML (Pustejovsky et al., 2003), which has also been chosen for Basque. In these guidelines we present the Basque version of ISO-TimeML (ISO-TimeML working group, 2008). After analysing the tags, attributes and values created for English, we describe the most appropriate ones to represent the information in Basque time structures.


The 1998 HTK system for transcription of conversational telephone speech

On demand translation for querying incompletely aligned datasets

Abstract:

More and more users aim to take advantage of the existing Linked Open Data environment by formulating a query over one dataset and then trying to process the same query over different datasets, one after another, in order to obtain a broader set of answers. However, the heterogeneity of the vocabularies used in the datasets on the one hand, and the scarcity of alignments among those datasets on the other, make that querying task difficult. Considering this scenario, we present in this paper a proposal that allows on-demand translation of queries formulated over an original dataset into queries expressed using the vocabulary of a targeted dataset. Our approach relieves users from having to know the vocabulary used in the targeted datasets; moreover, it considers situations where alignments do not exist or are not suitable for the formulated query. Therefore, in order to favour the possibility of getting answers, sometimes there is no guarantee of obtaining a semantically equivalent translation. The core component of our proposal is a query rewriting model that considers a set of transformation rules devised from a pragmatic point of view. The feasibility of our scheme has been validated with queries defined in well-known benchmarks and SPARQL endpoint logs, as the obtained results confirm.
