4 results for Data selection

in Archivo Digital para la Docencia y la Investigación - Repositorio Institucional de la Universidad del País Vasco


Relevance:

30.00%

Publisher:

Abstract:

Hyper-spectral data allow the construction of more robust statistical models for sampling material properties than the standard tri-chromatic color representation. However, because hyper-spectral data are high-dimensional and complex, extracting robust features (image descriptors) is not trivial. To make feature extraction efficient, decorrelation techniques are commonly applied to reduce the dimensionality of the hyper-spectral data, with the aim of generating compact and highly discriminative image descriptors. Current data decorrelation methodologies, such as principal component analysis (PCA), linear discriminant analysis (LDA), wavelet decomposition (WD), and band selection methods, require complex and subjective training procedures. In addition, the compressed spectral information is not directly related to the physical (spectral) characteristics of the analyzed materials. The main objective of this article is to introduce and evaluate a new data decorrelation methodology that closely emulates human vision. The proposed scheme has been used to minimize the amount of redundant information contained in the highly correlated hyper-spectral bands and has been comprehensively evaluated in the context of non-ferrous material classification.

Relevance:

30.00%

Publisher:

Abstract:

Background: Recently, with access to low-toxicity biological and targeted therapies, evidence has been emerging of a long-term survival subpopulation of cancer patients. We studied an unselected population with advanced lung cancer to look for evidence of multimodality in the survival distribution and to estimate the proportion of long-term survivors. Methods: We used survival data from 4944 patients with non-small-cell lung cancer (NSCLC), stages IIIb-IV at diagnosis, registered in the National Cancer Registry of Cuba (NCRC) between January 1998 and December 2006. We fitted one-component survival models and two-component mixture models to identify short- and long-term survivors. The Bayesian information criterion was used for model selection. Results: For all of the selected parametric distributions, the two-component model gave the best fit. The population with short-term survival (almost 4 months median survival) represented 64% of patients. The population with long-term survival included 35% of patients and showed a median survival of around 12 months. None of the short-term survivors was still alive at month 24, while 10% of the long-term survivors died afterwards. Conclusions: There is a subgroup showing long-term evolution among patients with advanced lung cancer. As survival rates continue to improve with the new generation of therapies, prognostic models that account for short- and long-term survival subpopulations should be considered in clinical research.
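The model comparison described in this abstract can be illustrated with a minimal sketch. The abstract does not say which parametric distributions were fitted or how censoring was handled, so everything below is an assumption made for illustration: exponential components, fully observed (uncensored) survival times, a plain EM fit, and synthetic data from a hypothetical `simulate` helper. It is not the authors' method.

```python
import math
import random


def fit_exponential(times):
    """MLE for one exponential: rate = n / sum(t). Returns (rate, log-likelihood)."""
    n, total = len(times), sum(times)
    rate = n / total
    return rate, n * math.log(rate) - rate * total


def fit_two_exponentials(times, iters=200):
    """EM fit of a two-component exponential mixture (assumes no censoring)."""
    n = len(times)
    mean = sum(times) / n
    w, r1, r2 = 0.5, 2.0 / mean, 0.5 / mean  # crude starting values
    for _ in range(iters):
        # E-step: probability that each time came from component 1
        resp = []
        for t in times:
            a = w * r1 * math.exp(-r1 * t)
            b = (1.0 - w) * r2 * math.exp(-r2 * t)
            resp.append(a / (a + b))
        # M-step: weighted versions of the single-exponential MLE
        s = sum(resp)
        w = s / n
        r1 = s / sum(g * t for g, t in zip(resp, times))
        r2 = (n - s) / sum((1.0 - g) * t for g, t in zip(resp, times))
    loglik = sum(
        math.log(w * r1 * math.exp(-r1 * t) + (1.0 - w) * r2 * math.exp(-r2 * t))
        for t in times
    )
    return (w, r1, r2), loglik


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; lower values indicate a better trade-off."""
    return n_params * math.log(n_obs) - 2.0 * loglik


def simulate(n=2000, seed=0):
    """Hypothetical synthetic data: 60% short survivors, 40% long survivors."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0 if rng.random() < 0.6 else 0.02) for _ in range(n)]


if __name__ == "__main__":
    times = simulate()
    _, ll1 = fit_exponential(times)
    (w, r1, r2), ll2 = fit_two_exponentials(times)
    print("BIC, one component :", bic(ll1, 1, len(times)))
    print("BIC, two components:", bic(ll2, 3, len(times)))
    print("estimated short-term fraction:", w)
```

The one-component model has one free parameter (the rate) and the mixture has three (one weight and two rates), which is what the `n_params` argument of `bic` reflects. Real registry data would need a likelihood that accounts for censored observations.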

Relevance:

30.00%

Publisher:

Abstract:

The study of emotions in human-computer interaction is a growing research area. This paper describes an attempt to select the most significant features for emotion recognition in spoken Basque and Spanish using different feature selection methods. The RekEmozio database was used as the experimental data set. Several Machine Learning paradigms were used for the emotion classification task. Experiments were executed in three phases, using a different set of features as classification variables in each phase. Moreover, feature subset selection was applied at each phase in order to search for the most relevant feature subset. The three-phase approach was chosen to check the validity of the proposed approach. The results show that an instance-based learning algorithm using feature subset selection techniques based on evolutionary algorithms is the best Machine Learning paradigm for automatic emotion recognition across all the feature sets, obtaining a mean emotion recognition rate of 80.05% in Basque and 74.82% in Spanish. To check the quality of the proposed process, a greedy search approach (FSS-Forward) was also applied, and a comparison between the two is provided. Based on the results, a set of the most relevant non-speaker-dependent features is proposed for both languages, and new perspectives are suggested.
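As a rough illustration of the greedy baseline mentioned in this abstract (FSS-Forward), the sketch below wraps forward feature selection around an instance-based learner. The abstract does not give the details of the procedure, so the choices here are assumptions: a 1-nearest-neighbour classifier, leave-one-out accuracy as the selection score, squared Euclidean distance, and a tiny made-up data set. The function names are hypothetical.

```python
def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on the given columns."""
    correct = 0
    for i, xi in enumerate(X):
        best_dist, best_label = None, None
        for j, xj in enumerate(X):
            if i == j:
                continue  # never compare a sample with itself
            dist = sum((xi[f] - xj[f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, y[j]
        correct += best_label == y[i]
    return correct / len(X)


def forward_selection(X, y):
    """Greedy forward search: add the feature that most improves the score, stop when none does."""
    remaining = list(range(len(X[0])))
    selected, best_score = [], 0.0
    while remaining:
        # Ties are broken in favour of the lowest feature index.
        score, neg_f = max((loo_accuracy(X, y, selected + [f]), -f) for f in remaining)
        if score <= best_score:
            break
        selected.append(-neg_f)
        remaining.remove(-neg_f)
        best_score = score
    return selected, best_score


if __name__ == "__main__":
    # Made-up data: column 0 is uninformative, column 1 separates the two classes.
    X = [[5, 0.0], [1, 0.1], [4, 0.2], [2, 1.0], [5, 1.1], [1, 1.2]]
    y = [0, 0, 0, 1, 1, 1]
    print(forward_selection(X, y))
```

A greedy search like this evaluates far fewer subsets than an evolutionary search, which is why it serves as a cheap point of comparison. It can miss features that only help in combination.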

Relevance:

30.00%

Publisher:

Abstract:

We aimed to study the selective pressures acting on SLC45A2 to investigate the interplay between selection and susceptibility to disease. To this end, we enrolled 500 volunteers from a geographically limited population (Basques from the North of Spain). By resequencing the whole coding region and intron 5 in the 34 most and the 34 least pigmented individuals according to the reflectance distribution, we observed that the polymorphism Leu374Phe (L374F, rs16891982) was statistically associated with skin color variability within this sample. In particular, allele 374F was significantly more frequent among the individuals with lighter skin. Further genotyping of an independent set of 558 individuals of known ancestry from a geographically wider Spanish population also revealed that the frequency of L374F was significantly correlated with incident UV radiation intensity. Selection tests suggest that allele 374F is being positively selected in South Europeans, indicating that depigmentation is an adaptive process. Interestingly, by genotyping 119 melanoma samples, we show that this variant is also associated with increased susceptibility to melanoma in our populations. The ultimate driving force for this adaptation is unknown, but it is compatible with the vitamin D hypothesis. This shows that molecular evolution analysis can be a useful tool for predicting phenotypic and biomedical consequences in humans.