48 results for speaker clustering
Abstract:
We present a generator of random networks in which both the degree-dependent clustering coefficient and the degree distribution are tunable. Following the same philosophy as the configuration model, the degree distribution and the clustering coefficient for each class of nodes of degree k are fixed ad hoc and a priori. The algorithm generates the corresponding topologies by first closing triangles and then applying the classical closure of the remaining free stubs. The procedure unveils a universal relation between clustering and degree-degree correlations for all networks, where the level of assortativity sets an upper limit on the level of clustering. Maximum assortativity places no restriction on the decay of the clustering coefficient, whereas disassortativity sets a stronger constraint on its behavior. Correlation measures in real networks are seen to respect this structural bound.
Abstract:
Background: The trithorax group (trxG) and Polycomb group (PcG) proteins are responsible for the maintenance of stable transcriptional patterns of many developmental regulators. They bind to specific regions of DNA and direct the post-translational modifications of histones, playing a role in the dynamics of chromatin structure. Results: We have performed genome-wide expression studies of trx and ash2 mutants in Drosophila melanogaster. Using computational analysis of our microarray data, we have identified 25 clusters of genes potentially regulated by TRX. Most of these clusters consist of genes that encode structural proteins involved in cuticle formation. This organization appears to be a distinctive feature of the regulatory networks of TRX and other chromatin regulators, since we have observed the same arrangement in clusters after experiments performed with ASH2, as well as in experiments performed by others with NURF, dMyc, and ASH1. We have also found many of these clusters to be significantly conserved in D. simulans, D. yakuba, D. pseudoobscura and partially in Anopheles gambiae. Conclusion: The analysis of genes governed by chromatin regulators has led to the identification of clusters of functionally related genes conserved in other insect species, suggesting this chromosomal organization is biologically important. Moreover, our results indicate that TRX and other chromatin regulators may act globally on chromatin domains that contain transcriptionally co-regulated genes.
Abstract:
Biometric system performance can be improved by means of data fusion. Several kinds of information can be fused in order to obtain a more accurate classification (identification or verification) of an input sample. In this paper we present a method for computing the weights in a weighted sum fusion for score combinations, by means of a likelihood model. The maximum likelihood estimation is posed as a linear programming problem. Each score is derived from a GMM classifier working on a different feature extractor. Our experimental results assessed the robustness of the system to changes over time (different sessions) and to a change of microphone. The improvements obtained were significantly better (error bars of two standard deviations) than a uniform weighted sum, a uniform weighted product, or the best single classifier. The computational cost of the proposed method scales with the number of scores to be fused in the same way as the simplex method for linear programming.
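As an illustration of the weighted sum fusion mentioned in this abstract, the following minimal sketch combines the scores of several classifiers for one input sample. The function names, the scores, and the weights are hypothetical; in particular, the weights here are chosen by hand and are not obtained from the likelihood model or linear program described in the paper.

```python
from typing import List, Sequence


def fuse_scores(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted-sum fusion of per-classifier scores for one input sample."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))


def uniform_weights(n: int) -> List[float]:
    """Baseline: every classifier gets the same weight."""
    return [1.0 / n] * n


# Hypothetical scores from three classifiers for the same sample.
scores = [0.2, 0.9, 0.4]
# Hypothetical hand-picked weights (assumption: not estimated from data).
weights = [0.1, 0.7, 0.2]

fused = fuse_scores(scores, weights)
baseline = fuse_scores(scores, uniform_weights(len(scores)))
```

The uniform baseline corresponds to the "uniform weighted sum" that the abstract uses as a point of comparison.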
Abstract:
In this paper we propose the inversion of nonlinear distortions in order to improve the recognition rates of a speaker recognizer system. We study the effect of saturations on the test signals, trying to take into account real situations where the training material has been recorded in a controlled situation but the testing signals present some mismatch with the input signal level (saturations). The experimental results for speaker recognition show that a combination of several strategies can improve the recognition rates with saturated test sentences from 80% to 89.39%, while the result with clean speech (without saturation) is 87.76% for one microphone. For speaker identification, the combination can reduce the minimum detection cost function with saturated test sentences from 6.42% to 4.15%, while the results with clean speech (without saturation) are 5.74% for one microphone and 7.02% for the other one.
Abstract:
We uncover the global organization of clustering in real complex networks. To this end, we ask whether triangles in real networks organize as in maximally random graphs with given degree and clustering distributions, or as in maximally ordered graph models where triangles are forced into modules. The answer comes by way of exploring m-core landscapes, where the m-core is defined, akin to the k-core, as the maximal subgraph with edges participating in at least m triangles. This property defines a set of nested subgraphs that, unlike k-cores, is able to distinguish between hierarchical and modular architectures. We find that the clustering organization in real networks is neither completely random nor ordered although, surprisingly, it is more random than modular. This supports the idea that the structure of real networks may in fact be the outcome of self-organized processes based on local optimization rules, in contrast to global optimization principles.
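The m-core definition given in this abstract can be sketched as an iterative pruning procedure: repeatedly delete every edge that takes part in fewer than m triangles until no such edge remains. This is a minimal sketch under that reading of the definition, not the authors' implementation; the function name `m_core` and the small example graph are hypothetical, and nodes are assumed to be integers.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[int, int]


def m_core(edges: Iterable[Edge], m: int) -> Set[Edge]:
    """Return the edges of the maximal subgraph in which every edge
    participates in at least m triangles (triangles counted inside
    the subgraph itself)."""
    adj: Dict[int, Set[int]] = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # Common neighbours of u and v = triangles through edge (u, v).
                if u < v and len(adj[u] & adj[v]) < m:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True

    return {(u, v) for u in adj for v in adj[u] if u < v}


# Hypothetical graph: a 4-clique on {0, 1, 2, 3}, a triangle 3-4-5,
# and a pendant edge 5-6.
example_edges = [
    (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
    (3, 4), (3, 5), (4, 5),
    (5, 6),
]
core1 = m_core(example_edges, 1)
core2 = m_core(example_edges, 2)
core3 = m_core(example_edges, 3)
```

In this example the 1-core drops only the pendant edge, the 2-core keeps only the 4-clique, and the 3-core is empty, which shows the nesting of the subgraphs.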
Abstract:
Zonal management in vineyards requires the prior delineation of stable yield zones within the parcel. Among the different methodologies used for zone delineation, cluster analysis of yield data from several years is one of the possibilities cited in the scientific literature. However, there are reasonable doubts concerning the cluster algorithm to be used and the number of zones that have to be delineated within a field. In this paper two different cluster algorithms (k-means and fuzzy c-means) have been compared using the grape yield data corresponding to three successive years (2002, 2003 and 2004) for a ‘Pinot Noir’ vineyard parcel. The final choice of the most recommendable algorithm has been linked to obtaining a stable pattern of spatial yield distribution and to allowing for the delineation of compact and average-sized areas. The general recommendation is to use reclassified maps of two clusters or yield classes (low yield zone and high yield zone) and, consequently, the site-specific vineyard management should be based on the prior delineation of just two different zones or sub-parcels. The two tested algorithms are good options for this purpose. However, the fuzzy c-means algorithm allows for a better zoning of the parcel, forming more compact areas with more balanced zonal differences over time.
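To make the two-class recommendation concrete, the sketch below runs a plain k-means with two clusters on one-dimensional yield values and labels each sampling point as low yield (0) or high yield (1). It is only an illustration: the yield values are invented, the function name `two_zone_kmeans` is hypothetical, a single yield value per point is assumed instead of the three years of data used in the paper, and fuzzy c-means is not shown.

```python
from typing import List, Sequence, Tuple


def two_zone_kmeans(
    yields: Sequence[float], max_iter: int = 100
) -> Tuple[List[int], Tuple[float, float]]:
    """1-D k-means with k = 2. Returns a label per point
    (0 = low yield zone, 1 = high yield zone) and the two centres."""
    lo, hi = min(yields), max(yields)
    if lo == hi:
        raise ValueError("need at least two distinct yield values")

    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point goes to the nearest centre.
        labels = [0 if abs(y - lo) <= abs(y - hi) else 1 for y in yields]
        low = [y for y, lab in zip(yields, labels) if lab == 0]
        high = [y for y, lab in zip(yields, labels) if lab == 1]
        # Update step: each centre moves to the mean of its points.
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break  # converged
        lo, hi = new_lo, new_hi
    return labels, (lo, hi)


# Hypothetical yields (arbitrary units) at six sampling points.
sample_yields = [4.1, 4.5, 3.9, 9.8, 10.4, 9.5]
zone_labels, (low_centre, high_centre) = two_zone_kmeans(sample_yields)
```

Initialising the centres at the minimum and maximum yield keeps the sketch deterministic and guarantees that neither zone is ever empty.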
Abstract:
In this paper we propose the inversion of nonlinear distortions in order to improve the recognition rates of a speaker recognizer system. We study the effect of saturations on the test signals, trying to take into account real situations where the training material has been recorded in a controlled situation but the testing signals present some mismatch with the input signal level (saturations). The experimental results show that a combination of several strategies can improve the recognition rates with saturated test sentences from 80% to 89.39%, while the result with clean speech (without saturation) is 87.76% for one microphone.
Abstract:
Peer-reviewed
A new approach to segmentation based on fusing circumscribed contours, region growing and clustering
Abstract:
One of the major problems in machine vision is the segmentation of images of natural scenes. This paper presents a new proposal for the image segmentation problem based on the integration of edge and region information. The main contours of the scene are detected and used to guide the subsequent region growing process. The algorithm places a number of seeds on both sides of a contour, which allows a set of concurrent growing processes to be started. A prior analysis of the seeds makes it possible to adjust the homogeneity criterion to the region's characteristics. A new homogeneity criterion based on clustering analysis and convex hull construction is proposed.
Abstract:
This paper provides empirical evidence that continuous time models with one factor of volatility are, under some conditions, able to fit the main characteristics of financial data. It also reports the importance of the feedback factor in capturing the strong volatility clustering of data, caused by a possible change in the pattern of volatility in the last part of the sample. We use the Efficient Method of Moments (EMM) by Gallant and Tauchen (1996) to estimate logarithmic models with one and two stochastic volatility factors (with and without feedback) and to select among them.
Abstract:
Delayed perfect monitoring in an infinitely repeated discounted game is modelled by letting the players form a connected and undirected network. Players observe their immediate neighbors' behavior only, but communicate over time the repeated game's history truthfully throughout the network. The Folk Theorem extends to this setup, although for a range of discount factors strictly below 1, the set of sequential equilibria and the corresponding payoff set may be reduced. A general class of games is analyzed without imposing restrictions on the dimensionality of the payoff space. This and the bilateral communication structure allow for limited results under strategic communication only. As a by-product this model produces a network result; namely, the level of cooperation in this setup depends on the network's diameter, and not on its clustering coefficient as in other models.
Abstract:
One of the main implications of the efficient market hypothesis (EMH) is that expected future returns on financial assets are not predictable if investors are risk neutral. In this paper we argue that financial time series offer more information than this hypothesis seems to allow. In particular we postulate that runs of very large returns can be predictable for small time periods. In order to prove this we propose a TAR(3,1)-GARCH(1,1) model that is able to describe two different types of extreme events: a first type generated by large uncertainty regimes where runs of extremes are not predictable and a second type where extremes come from isolated dread/joy events. This model is new in the literature on nonlinear processes. Its novelty resides in two features that make it different from previous TAR methodologies: the regimes are motivated by the occurrence of extreme values, and the threshold variable is defined by the shock affecting the process in the preceding period. In this way the model is able to uncover dependence and clustering of extremes in high as well as in low volatility periods. The model is tested with data from General Motors stock prices corresponding to two crises that had a substantial impact on financial markets worldwide: the Black Monday of October 1987 and September 11th, 2001. By analyzing the periods around these crises we find evidence of statistical significance of our model and thereby of predictability of extremes for September 11th but not for Black Monday. These findings support the hypotheses of a big negative event producing runs of negative returns in the first case, and of the burst of a worldwide stock market bubble in the second example.
JEL classification: C12; C15; C22; C51
Keywords and Phrases: asymmetries, crises, extreme values, hypothesis testing, leverage effect, nonlinearities, threshold models
Abstract:
The advent of the European Union has decreased the diversification benefits available from country-based equity market indices in the region. This paper measures the increase in stock integration between the three largest new EU members (Hungary, the Czech Republic and Poland, which joined in May 2004) and the Euro-zone. A potentially gradual transition in correlations is accommodated in a single VAR model by embedding smooth transition conditional correlation models with fat tails, spillovers, volatility clustering, and asymmetric volatility effects. At the country market index level all three Eastern European markets show a considerable increase in correlations in 2006. At the industry level the dates and transition periods for the correlations differ, and the correlations are lower although also increasing. The results show that sectoral indices in Eastern European markets may provide larger diversification opportunities than the aggregate market.
JEL classifications: C32; C51; F36; G15
Keywords: Multivariate GARCH; Smooth Transition Conditional Correlation; Stock Return Comovement; Sectoral correlations; New EU Members
Abstract:
Clustering techniques can help reduce supervision in pattern acquisition processes for Information Extraction. This work, which spans four years of research, begins by studying the document representation best suited to the clustering task. To avoid the biases of individual clustering methods, ensemble clustering methods are considered. Several supervised combination methods are explored, and automatic strategies are added to determine the number of clusters in the combination. Mechanisms for obtaining weighted ensemble clusterings are also considered, as well as unsupervised combination strategies. Finally, the clustering results are used in a pattern acquisition system to replace the elements of human supervision. All these strategies and methods are evaluated on document clustering and pattern acquisition tasks using real data. The results show that words as a document representation outperform other models for the clustering task, that ensemble clustering overcomes the limitations of individual clusterings, and that unsupervised pattern acquisition strategies obtain competitive results with respect to supervised strategies.