930 results for selection methods


Relevance:

60.00%

Abstract:

R. Jensen, Q. Shen, Data Reduction with Rough Sets, In: Encyclopedia of Data Warehousing and Mining - 2nd Edition, Vol. II, 2008.

Relevance:

60.00%

Abstract:

Thesis presented to Universidade Fernando Pessoa in partial fulfilment of the requirements for the degree of Doctor of Social Sciences, specialising in Sociology

Relevance:

60.00%

Abstract:

Fuzzy-neural-network-based inference systems are well-known universal approximators which can produce linguistically interpretable results. Unfortunately, their dimensionality can be extremely high due to an excessive number of inputs and rules, which raises the need for overall structure optimization. In the literature, various input selection methods are available, but they are applied separately from rule selection, often without considering the fuzzy structure. This paper proposes an integrated framework to optimize the number of inputs and the number of rules simultaneously. First, a method is developed to select the most significant rules, along with a refinement stage to remove unnecessary correlations. An improved information criterion is then proposed to find an appropriate number of inputs and rules to include in the model, leading to a balanced tradeoff between interpretability and accuracy. Simulation results confirm the efficacy of the proposed method.

Relevance:

60.00%

Abstract:

Many of the most interesting questions ecologists ask lead to analyses of spatial data. Yet, perhaps confused by the large number of statistical models and fitting methods available, many ecologists seem to believe this is best left to specialists. Here, we describe the issues that need consideration when analysing spatial data and illustrate these using simulation studies. Our comparative analysis involves using methods including generalized least squares, spatial filters, wavelet-revised models, conditional autoregressive models and generalized additive mixed models to estimate regression coefficients from synthetic but realistic data sets, including some which violate standard regression assumptions. We assess the performance of each method using two measures and using statistical error rates for model selection. Methods that performed well included the generalized least squares family of models and a Bayesian implementation of the conditional autoregressive model. Ordinary least squares also performed adequately in the absence of model selection, but had poorly controlled Type I error rates and so did not show the improvements in performance under model selection that were seen with the above methods. Removing large-scale spatial trends in the response led to poor performance. These are empirical results; hence extrapolation of these findings to other situations should be performed cautiously. Nevertheless, our simulation-based approach provides much stronger evidence for comparative analysis than assessments based on single or small numbers of data sets, and should be considered a necessary foundation for statements of this type in future.

Relevance:

60.00%

Abstract:

Over the last 15 years, the supernova community has endeavoured to directly identify progenitor stars for core-collapse supernovae discovered in nearby galaxies. These precursors are often visible as resolved stars in high-resolution images from space- and ground-based telescopes. The discovery rate of progenitor stars is limited by the local supernova rate and the availability and depth of archive images of galaxies, with 18 detections of precursor objects and 27 upper limits. This review compiles these results (from 1999 to 2013) in a distance-limited sample and discusses the implications of the findings. The vast majority of the detections of progenitor stars are of type II-P, II-L, or IIb, with one type Ib progenitor system detected and many more upper limits for progenitors of Ibc supernovae (14 in all). The data for these 45 supernova progenitors illustrate a remarkable deficit of high-luminosity stars above an apparent limit of $\log L/L_\odot \simeq 5.1$ dex. For a typical Salpeter initial mass function, one would expect to have found 13 high-luminosity and high-mass progenitors by now. There is, possibly, only one object in this time- and volume-limited sample that is unambiguously high-mass (the progenitor of SN2009ip), although the nature of that supernova is still debated. The possible biases due to the influence of circumstellar dust, the luminosity analysis, and sample selection methods are reviewed. It does not appear likely that these can explain the missing high-mass progenitor stars. This review concludes that the community's work to date shows that the observed populations of supernovae in the local Universe are not, on the whole, produced by high-mass ($M \gtrsim 18\,M_\odot$) stars. Theoretical explosions of model stars also predict that black hole formation and failed supernovae tend to occur above an initial mass of $M \simeq 18\,M_\odot$. The models also suggest there is no simple single mass division for neutron star or black-hole formation and that there are islands of explodability for stars in the $8$-$120\,M_\odot$ range. The observational constraints are quite consistent with the bulk of stars above $M \simeq 18\,M_\odot$ collapsing to form black holes with no visible supernovae.
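
The expectation quoted in the abstract above can be checked to order of magnitude with a short calculation. The sketch below assumes a pure Salpeter power law, dN/dM ∝ M^-2.35, and borrows the 8, 18 and 120 solar-mass bounds from the abstract's text; the review's own calculation may use different limits, so this is only a rough consistency check and not a reproduction of its method. The function name is hypothetical.

```python
def salpeter_number_fraction(m_lo, m_hi, m_min, m_max, alpha=2.35):
    """Fraction of stars born in [m_min, m_max] whose mass lies in [m_lo, m_hi],
    for a power-law initial mass function dN/dM proportional to M**(-alpha)."""
    def integral(a, b):
        # Closed-form integral of M**(-alpha) from a to b (valid for alpha != 1).
        k = 1.0 - alpha
        return (b**k - a**k) / k
    return integral(m_lo, m_hi) / integral(m_min, m_max)

# Assumed bounds: core-collapse progenitors between 8 and 120 solar masses,
# with "high-mass" meaning above 18 solar masses.
fraction = salpeter_number_fraction(18.0, 120.0, 8.0, 120.0)
expected = 45 * fraction  # 45 progenitors (18 detections + 27 upper limits)
print(f"high-mass fraction = {fraction:.3f}, expected in a sample of 45 = {expected:.1f}")
```

Under these assumptions roughly a third of the sample, about 14 of the 45 objects, would be expected above 18 solar masses, which is of the same order as the 13 quoted in the abstract.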

Relevance:

60.00%

Abstract:

This thesis proposes a different approach to robot navigation in dynamic environments, in which the robot takes advantage of pedestrian motion to improve its own navigation capabilities. The main idea is that, instead of treating people as dynamic obstacles to be avoided, they should be treated as special agents with advanced knowledge of navigating dynamic environments. To benefit from pedestrian motion, this work proposes that a robot select and follow pedestrians, so that it can move along optimal paths, avoid undetected obstacles, improve navigation in densely populated environments, and increase its acceptance by people. To achieve these goals, new methods are developed in the area of leader selection, where two techniques are explored. The first uses motion prediction methods, while the second uses machine learning techniques, trained on real examples, to assess the quality of leader candidates. The leader selection methods are integrated with motion planning algorithms, and experiments are carried out to validate the proposed techniques.

Relevance:

60.00%

Abstract:

Thesis (Master's)--University of Washington, 2013

Relevance:

60.00%

Abstract:

Thesis (Master's)--University of Washington, 2014

Relevance:

60.00%

Abstract:

Documents published by companies, such as press releases, contain a wealth of information about the companies' various activities. They are a valuable source for business intelligence analyses. However, given their large volume, tools must be developed to exploit this source automatically. This thesis describes work within one strand of business intelligence, namely the detection of business relations between the companies described in press releases. We propose a classification-based approach. Existing classification methods do not yield satisfactory performance, notably because of two problems: representing the text by all of its words, which does not necessarily help to characterise a business relation, and class imbalance. To address the first problem, we propose a representation based on pivot words, that is, the names of the companies concerned, in order to better identify the words likely to describe them. For the second problem, we propose a two-stage classification, which proves more appropriate than traditional resampling methods. We tested our approaches on a collection of press releases from the automotive sector. Our experiments show that the proposed approaches can improve classification performance. In particular, the document representation based on pivot words lets us focus on the words that are useful for detecting business relations, and the two-stage classification offers an effective solution to the class imbalance problem. This work shows that the automatic detection of business relations is a feasible task. The results of this detection could be used in business intelligence analysis.

Relevance:

60.00%

Abstract:

Although the concept of wisdom has been widely studied by experts in fields such as philosophy, religion and psychology, it still faces limitations regarding its definition and assessment. The aim of this work is therefore to formulate a definition of wisdom that makes it possible to propose a way of assessing it as a competency in managers. To this end, a qualitative documentary analysis was carried out. Various texts on the history, definitions and methodologies for assessing both wisdom and competencies were analysed, differentiating wisdom from other constructs and examining the difference between general and managerial competencies, in order to then define wisdom as a managerial competency. The result of this analysis was a prototype test called SAPIENS-O, which seeks to assess wisdom as a managerial competency. The scope of the instrument includes the possibility of measuring wisdom as a competency in managers, of offering a new perspective on the theoretical and empirical difficulties surrounding wisdom, and of facilitating the study of wisdom in real settings, more specifically organisational settings.

Relevance:

60.00%

Abstract:

Gaussian multi-scale representation is a mathematical framework that allows images to be analysed at different scales in a consistent manner, and derivatives to be handled in a way deeply connected to scale. This paper uses Gaussian multi-scale representation to investigate several aspects of the derivation of atmospheric motion vectors (AMVs) from water vapour imagery. The contribution of different spatial frequencies to the tracking is studied for a range of tracer sizes, and a number of tracer selection methods are presented and compared, using WV 6.2 images from the geostationary satellite MSG-2.

Relevance:

60.00%

Abstract:

Background: The electroencephalogram (EEG) may be described by a large number of different feature types and automated feature selection methods are needed in order to reliably identify features which correlate with continuous independent variables. New method: A method is presented for the automated identification of features that differentiate two or more groups in neurological datasets based upon a spectral decomposition of the feature set. Furthermore, the method is able to identify features that relate to continuous independent variables. Results: The proposed method is first evaluated on synthetic EEG datasets and observed to reliably identify the correct features. The method is then applied to EEG recorded during a music listening task and is observed to automatically identify neural correlates of music tempo changes similar to neural correlates identified in a previous study. Finally, the method is applied to identify neural correlates of music-induced affective states. The identified neural correlates reside primarily over the frontal cortex and are consistent with widely reported neural correlates of emotions. Comparison with existing methods: The proposed method is compared to the state-of-the-art methods of canonical correlation analysis and common spatial patterns, in order to identify features differentiating synthetic event-related potentials of different amplitudes and is observed to exhibit greater performance as the number of unique groups in the dataset increases. Conclusions: The proposed method is able to identify neural correlates of continuous variables in EEG datasets and is shown to outperform canonical correlation analysis and common spatial patterns.

Relevance:

60.00%

Abstract:

In a global economy, manufacturers compete mainly on the cost efficiency of production, as raw material prices are similar worldwide. Heavy industry has two major issues to deal with. On the one hand, there is a large amount of data that needs to be analysed effectively; on the other hand, making big improvements through investments in corporate structure or new machinery is neither economically nor physically viable. Machine learning offers a promising way for manufacturers to address both problems, as their massive resource of historical production data puts them in an excellent position to employ learning techniques. However, choosing a modelling strategy in this setting is far from trivial, and this is the subject of this article. The article investigates the characteristics of the most popular classifiers used in industry today: Support Vector Machines, Multilayer Perceptrons, Decision Trees, Random Forests, and the meta-algorithms Bagging and Boosting. Lessons from real-world implementations of these learners are also provided, together with future directions and an indication of when different learners are expected to perform well. The importance of feature selection, and the selection methods relevant in an industrial setting, are further investigated. Performance metrics are also discussed for the sake of completeness.

Relevance:

60.00%

Abstract:

1. Studies of landscape change are seldom conducted at scales commensurate with the processes they purport to investigate. Landscape change is a landscape-level process, yet most studies focus on patches. Even when landscape context is considered, inference remains at the patch-level. The unit of investigation must be extended beyond individual patches to whole mosaics in order to advance understanding of faunal responses to landscape change.

2. In this study, we aggregated data from multiple sites per landscape such that both the response and explanatory variables characterized 'whole' landscapes, allowing for landscape-level inference about factors influencing species' incidence.

3. We used hierarchical partitioning and Bayesian variable selection methods to develop species-specific models that examined the influence of four categories of landscape properties – habitat extent, habitat configuration, landscape composition and geographical location – on the incidence of 58 species of woodland-dependent birds in 24 agricultural landscapes (each 100 km²) in south-eastern Australia.

4. There was strong evidence for a positive effect of habitat extent for 27 species. Thirty species were related to at least one of the four landscape composition variables, and geographical location was important for 19 species. Habitat configuration was influential for 13 species and where important, the impacts of fragmentation per se were detrimental.

5. Variation among species in the influential landscape variables indicates that different species respond to different sets of cues in land mosaics. Thus, although all species were grouped a priori as 'woodland-dependent', expectations based on general ecological characteristics may prove unreliable.

6. Synthesis and applications. These results underscore the value of moving beyond the fragmentation paradigm focused on the spatial pattern of habitat vs. non-habitat, to a greater appreciation of the composition and heterogeneity of land mosaics. Landscape-level inference will enable improved conservation outcomes by recognizing the influence of landscape properties on biota and devising strategies at this scale to complement patch-based management. We provide strong empirical evidence that biodiversity management in agricultural landscapes must focus on habitat extent. Complementary management of other landscape attributes, such as habitat aggregation and intensity of agricultural land-use, will also enhance the value of agricultural landscapes for woodland birds.
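
Hierarchical partitioning, mentioned in point 3 above, assigns each explanatory variable an independent contribution by averaging the gain in goodness of fit it brings across every model in which it can appear. The following is a minimal sketch of that idea, assuming an ordinary least-squares fit with R² as the goodness-of-fit measure; the study itself models species incidence and may well use a different model family and fit statistic. The function names are hypothetical.

```python
from itertools import combinations
from math import factorial

import numpy as np


def r_squared(X, y, cols):
    """R^2 of a least-squares fit of y on the given columns of X plus an intercept."""
    design = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    total = y - y.mean()
    return 1.0 - (resid @ resid) / (total @ total)


def independent_contributions(X, y):
    """Average R^2 gain from adding each variable, taken over all subsets of the
    other variables: averaged within each subset size, then across sizes."""
    p = X.shape[1]
    contrib = np.zeros(p)
    for j in range(p):
        others = [c for c in range(p) if c != j]
        for size in range(p):
            # Weight equal to 1 / (p * number of subsets of this size).
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                gain = r_squared(X, y, subset + (j,)) - r_squared(X, y, subset)
                contrib[j] += weight * gain
    return contrib
```

A useful property of this weighting is that the independent contributions sum to the R² of the full model, so they can be read as shares of the explained variance. The cost grows exponentially with the number of variables, which is why the technique is normally applied to a small set of candidate predictors.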

Relevance:

60.00%

Abstract:

Recently, much attention has been given to mass spectrometry (MS) based disease classification, diagnosis, and protein-based biomarker identification. As with microarray-based investigations, the proteomic data generated by such high-throughput experiments often have a high feature-to-sample ratio. Moreover, biological information and patterns are compounded with data noise, redundancy and outliers. Thus, the development of algorithms and procedures for the analysis and interpretation of such data is of paramount importance. In this paper, we propose a hybrid system for analyzing such high-dimensional data. The proposed method uses a feature extraction and selection procedure based on the k-means clustering algorithm to bridge filter and wrapper selection methods. The potentially informative mass/charge (m/z) markers selected by filters are passed to the k-means clustering algorithm for correlation and redundancy reduction, and a multi-objective Genetic Algorithm selector is then employed to identify discriminative m/z markers among those generated by the k-means clustering algorithm. Experimental results obtained with the proposed method indicate that it is suitable for m/z biomarker selection and MS-based sample classification.
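
The first two stages described in the abstract above (a filter, followed by k-means clustering to reduce redundancy) can be illustrated with a small sketch. It makes several assumptions that are not taken from the paper: a Welch-style t statistic stands in for the filter, a plain Lloyd's k-means groups candidate features by their standardised intensity profiles, and the top-scoring feature of each cluster is kept as its representative. The multi-objective Genetic Algorithm wrapper stage is not implemented here, and all function names are hypothetical.

```python
import numpy as np


def filter_scores(X, y):
    """Absolute Welch-style t statistic per feature for a two-class problem (assumed filter)."""
    a, b = X[y == 0], X[y == 1]
    spread = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (spread + 1e-12)


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row of `points`."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = points[labels == c]
            if len(members):  # keep the previous centre if a cluster empties
                centers[c] = members.mean(axis=0)
    return labels


def select_markers(X, y, n_keep=20, k=5):
    """Filter stage, then one representative feature per k-means cluster."""
    scores = filter_scores(X, y)
    top = np.argsort(scores)[::-1][:n_keep]  # indices of the best-scoring features
    k = min(k, len(top))
    profiles = X[:, top].T  # one row per candidate feature
    profiles = (profiles - profiles.mean(axis=1, keepdims=True)) / (
        profiles.std(axis=1, keepdims=True) + 1e-12
    )
    labels = kmeans(profiles, k)
    chosen = []
    for c in range(k):
        members = top[labels == c]
        if len(members):
            chosen.append(int(members[np.argmax(scores[members])]))
    return sorted(chosen)
```

Calling `select_markers(X, y)` on a samples-by-features intensity matrix `X` with binary labels `y` returns at most `k` column indices. In the pipeline the abstract describes, a reduced set of this kind would then be handed to the wrapper stage for the final choice of discriminative markers.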