4 results for Probability density function
at Universidad de Alicante
Abstract:
In this paper, we propose a novel filter for feature selection. The filter relies on estimating the mutual information between features and classes. We bypass the estimation of the probability density function with the aid of the entropic-graphs approximation of Rényi entropy, and the subsequent approximation of Shannon entropy. The complexity of this bypassing process depends not on the number of dimensions but on the number of patterns/samples, and thus the curse of dimensionality is circumvented. We show that it is then possible to outperform a greedy algorithm based on the maximal relevance and minimal redundancy criterion. We successfully test our method in the contexts of both image classification and microarray data classification.
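As a rough illustration of the idea in this abstract, the sketch below scores a feature set by plugging a minimum-spanning-tree (entropic graph) estimate of Rényi entropy into the decomposition I(X;C) = H(X) - Σ_c p(c) H(X|C=c). It is not the authors' implementation: the function names are hypothetical, the unknown additive constant of the estimator is dropped (it cancels in the difference), and the Rényi entropy of order α = (d - γ)/d stands in for Shannon entropy, so the extrapolation step the abstract mentions is omitted. The result is only a surrogate score; it can be negative on small samples.

```python
import math

def mst_length(points, gamma=1.0):
    """Sum of (edge length ** gamma) over the Euclidean MST, via Prim's O(n^2) algorithm."""
    n = len(points)
    if n < 2:
        return 0.0
    best = [math.dist(points[0], p) for p in points]  # distance of each point to the tree
    in_tree = [False] * n
    in_tree[0] = True
    total = 0.0
    for _ in range(n - 1):
        j = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        total += best[j] ** gamma
        in_tree[j] = True
        for i in range(n):
            if not in_tree[i]:
                best[i] = min(best[i], math.dist(points[j], points[i]))
    return total

def renyi_entropy_up_to_const(points, gamma=1.0):
    """MST-based Rényi entropy estimate of order alpha = (d - gamma) / d.

    The additive constant -ln(beta) / (1 - alpha) is omitted (assumption:
    only differences between estimates in the same dimension d are used).
    Needs at least two distinct points and 0 < gamma < d.
    """
    n, d = len(points), len(points[0])
    alpha = (d - gamma) / d
    return math.log(mst_length(points, gamma) / n ** alpha) / (1.0 - alpha)

def mi_surrogate(points, labels, gamma=1.0):
    """H(X) - sum_c p(c) H(X | C=c), with Rényi estimates in place of Shannon entropy."""
    n = len(points)
    score = renyi_entropy_up_to_const(points, gamma)
    for c in set(labels):
        subset = [p for p, l in zip(points, labels) if l == c]
        score -= len(subset) / n * renyi_entropy_up_to_const(subset, gamma)
    return score
```

The cost is driven by the number of samples (the O(n^2) spanning-tree construction), not by the number of features, which is the property the abstract emphasizes. A filter would compare such scores across candidate feature subsets; because α changes with the dimension d, that comparison is where the approximation of Shannon entropy becomes necessary.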
Abstract:
In this paper, we prove that infinite-dimensional vector spaces of α-dense curves are generated by means of the functional equations f(x)+f(2x)+⋯+f(nx)=0, with n≥2, which are related to the partial sums of the Riemann zeta function. These curves α-densify a large class of compact sets of the plane for arbitrarily small α, extending the known result that this holds for the cases n=2,3. Finally, we prove the existence of a family of solutions of this functional equation which has the property of quadrature in the compact set that it densifies, that is, the product of the length of the curve by the nth power of the density approaches the Jordan content of the compact set which the curve densifies.
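One elementary way to see the link with the partial sums of the zeta function is to try power-type solutions. This is offered only as an illustration and is not necessarily the construction used in the paper. The notation ζ_n for the nth partial sum and the restriction to x > 0 are assumptions made here.

```latex
\[
  \zeta_n(z) = \sum_{k=1}^{n} k^{-z}, \qquad
  f(x) = x^{-z} \quad (x > 0,\ z \in \mathbb{C}).
\]
\[
  f(x) + f(2x) + \dots + f(nx)
  = \sum_{k=1}^{n} (kx)^{-z}
  = x^{-z}\,\zeta_n(z),
\]
so $f$ solves the functional equation whenever $\zeta_n(z) = 0$. For $n = 2$,
\[
  1 + 2^{-z} = 0
  \iff z = \frac{(2m+1)\pi i}{\ln 2}, \qquad m \in \mathbb{Z}.
\]
Writing $z = a + ib$, the real and imaginary parts
$x^{-a}\cos(b \ln x)$ and $x^{-a}\sin(b \ln x)$
are real-valued solutions, because the equation is linear with real coefficients.
```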
Abstract:
Studies on positive plant–plant relations have traditionally focused on pair-wise interactions. Conversely, the interaction with other co-occurring species has scarcely been addressed, despite the fact that the entire community may affect plant performance. We used woody vegetation patches as models to evaluate community facilitation in semi-arid steppes. We characterized biotic and physical attributes of 53 woody patches (patch size, litter accumulation, canopy density, vegetation cover, species number and identity, and phylogenetic distance), and soil fertility (organic C and total N), and evaluated their relative importance for the performance of seedlings of Pistacia lentiscus, a keystone woody species in western Mediterranean steppes. Seedlings were planted underneath the patches, and on their northern and southern edges. Woody patches positively affected seedling survival but not seedling growth. Soil fertility was higher underneath the patches than elsewhere. Physical and biotic attributes of woody patches affected seedling survival, but these effects depended on microsite conditions. The composition of the community of small shrubs and perennial grasses growing underneath the patches controlled seedling performance. An increase in Stipa tenacissima and a decrease in Brachypodium retusum increased the probability of survival. The cover of these species and other small shrubs, litter depth and community phylogenetic distance were also related to seedling survival. Seedlings planted on the northern edge of the patches were mostly affected by attributes of the biotic community. These traits were of lesser importance for seedlings planted underneath and on the southern edge of patches, suggesting that constraints to seedling establishment differed within the patches.
Our study highlights the importance of taking into consideration community attributes over pair-wise interactions when evaluating the outcome of ecological interactions in multi-specific communities, as they have profound implications for the composition, function and management of semi-arid steppes.
Abstract:
This paper proposes an adaptive algorithm for clustering cumulative probability distribution functions (c.p.d.f.) of a continuous random variable, observed in different populations, into the minimum number of homogeneous clusters, making no parametric assumptions about the c.p.d.f.’s. The proposed distance function for clustering c.p.d.f.’s is based on the Kolmogorov–Smirnov two-sample statistic. This test is able to detect differences in position, dispersion or shape of the c.p.d.f.’s. In our context, this statistic allows us to cluster the recorded data with a homogeneity criterion based on the whole distribution of each data set, and to decide whether it is necessary to add more clusters or not. In this sense, the proposed algorithm is adaptive, as it automatically increases the number of clusters only as necessary; therefore, there is no need to fix the number of clusters in advance. The outputs of the algorithm are, for each cluster, the common c.p.d.f. of all observed data in the cluster (the centroid) and the Kolmogorov–Smirnov statistic between the centroid and the most distant c.p.d.f. The proposed algorithm has been applied to a large data set of solar global irradiation spectra distributions. The results make it possible to reduce the information in more than 270,000 c.p.d.f.’s to only 6 different clusters that correspond to 6 different c.p.d.f.’s.
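The following is a minimal sketch of the general idea, not the algorithm from the paper, whose details the abstract does not give. It assumes a simple one-pass "leader" scheme. Each sample joins the first cluster whose pooled empirical distribution (the centroid) lies within a Kolmogorov–Smirnov distance of `max_dist`; otherwise it opens a new cluster. The names `ks_statistic`, `cluster_samples`, `max_dist` and `radius` are hypothetical.

```python
import bisect

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    # Both empirical CDFs are right-continuous step functions, so the
    # supremum of their difference is attained at one of the sample points.
    return max(
        abs(bisect.bisect_right(a, x) / len(a) - bisect.bisect_right(b, x) / len(b))
        for x in a + b
    )

def cluster_samples(samples, max_dist):
    """Greedy one-pass clustering of samples by KS distance to pooled centroids.

    The number of clusters is not fixed in advance: a new cluster is opened
    only when no existing centroid is within max_dist (a hypothetical
    threshold; a significance-based critical value could be used instead).
    """
    clusters = []
    for idx, sample in enumerate(samples):
        for cluster in clusters:
            if ks_statistic(sample, cluster["pooled"]) <= max_dist:
                cluster["members"].append(idx)
                cluster["pooled"].extend(sample)  # centroid = pooled data
                break
        else:
            clusters.append({"members": [idx], "pooled": list(sample)})
    # Report, per cluster, the KS statistic of its most distant member.
    for cluster in clusters:
        cluster["radius"] = max(
            ks_statistic(samples[i], cluster["pooled"]) for i in cluster["members"]
        )
    return clusters
```

Because the centroid changes as samples are pooled, an early member can end up farther from the final centroid than `max_dist`. The `radius` field makes that visible, and a fuller implementation would presumably re-check and reassign such members.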