881 resultados para Feature Felection
Resumo:
Although many feature selection methods for classification have been developed, there is a need to identify genes in high-dimensional data with censored survival outcomes. Traditional methods for gene selection in classification problems have several drawbacks. First, the majority of the gene selection approaches for classification are single-gene based. Second, many of the gene selection procedures are not embedded within the algorithm itself. The technique of random forests has been found to perform well in high-dimensional data settings with survival outcomes. It also has an embedded feature to identify variables of importance. Therefore, it is an ideal candidate for gene selection in high-dimensional data with survival outcomes. In this paper, we develop a novel method based on the random forests to identify a set of prognostic genes. We compare our method with several machine learning methods and various node split criteria using several real data sets. Our method performed well in both simulations and real data analysis.Additionally, we have shown the advantages of our approach over single-gene-based approaches. Our method incorporates multivariate correlations in microarray data for survival outcomes. The described method allows us to better utilize the information available from microarray data with survival outcomes.
Resumo:
The last few years have seen a substantial increase in the geometric complexity for 3D flow simulation. In this paper we describe the challenges in generating computation grids for 3D aerospace configuations and demonstrate the progress made to eventually achieve a push button technology for CAD to visualized flow. Special emphasis is given to the interfacing from the grid generator to the flow solver by semi-automatic generation of boundary conditions during the grid generation process. In this regard, once a grid has been generated, push button technology of most commercial flow solvers has been achieved. This will be demonstrated by the ad hoc simulation for the Hopper configuration.
Resumo:
Traditionally, marine ecosystem structure was thought to be bottom-up controlled. In recent years, a number of studies have highlighted the importance of top-down regulation. Evidence is accumulating that the type of trophic forcing varies temporally and spatially, and an integrated view – considering the interplay of both types of control – is emerging. Correlations between time series spanning several decades of the abundances of adjacent trophic levels are conventionally used to assess the type of control: bottom-up if positive or top-down if this is negative. This approach implies averaging periods which might show time-varying dynamics and therefore can hide part of this temporal variability. Using spatially referenced plankton information extracted from the Continuous Plankton Recorder, this study addresses the potential dynamic character of the trophic structure at the planktonic level in the North Sea by assessing its variation over both temporal and spatial scales. Our results show that until the early-1970s a bottom-up control characterized the base of the food web across the whole North Sea, with diatoms having a positive and homogeneous effect on zooplankton filter-feeders. Afterwards, different regional trophic dynamics were observed, in particular a negative relationship between total phytoplankton and zooplankton was detected off the west coast of Norway and the Skagerrak as opposed to a positive one in the southern reaches. Our results suggest that after the early 1970s diatoms remained the main food source for zooplankton filter-feeders east of Orkney–Shetland and off Scotland, while in the east, from the Norwegian Trench to the German Bight, filter-feeders were mainly sustained by dinoflagellates.
Resumo:
This paper provides a summary of our studies on robust speech recognition based on a new statistical approach – the probabilistic union model. We consider speech recognition given that part of the acoustic features may be corrupted by noise. The union model is a method for basing the recognition on the clean part of the features, thereby reducing the effect of the noise on recognition. To this end, the union model is similar to the missing feature method. However, the two methods achieve this end through different routes. The missing feature method usually requires the identity of the noisy data for noise removal, while the union model combines the local features based on the union of random events, to reduce the dependence of the model on information about the noise. We previously investigated the applications of the union model to speech recognition involving unknown partial corruption in frequency band, in time duration, and in feature streams. Additionally, a combination of the union model with conventional noise-reduction techniques was studied, as a means of dealing with a mixture of known or trainable noise and unknown unexpected noise. In this paper, a unified review, in the context of dealing with unknown partial feature corruption, is provided into each of these applications, giving the appropriate theory and implementation algorithms, along with an experimental evaluation.
Resumo:
Feature selection and feature weighting are useful techniques for improving the classification accuracy of K-nearest-neighbor (K-NN) rule. The term feature selection refers to algorithms that select the best subset of the input feature set. In feature weighting, each feature is multiplied by a weight value proportional to the ability of the feature to distinguish pattern classes. In this paper, a novel hybrid approach is proposed for simultaneous feature selection and feature weighting of K-NN rule based on Tabu Search (TS) heuristic. The proposed TS heuristic in combination with K-NN classifier is compared with several classifiers on various available data sets. The results have indicated a significant improvement in the performance in classification accuracy. The proposed TS heuristic is also compared with various feature selection algorithms. Experiments performed revealed that the proposed hybrid TS heuristic is superior to both simple TS and sequential search algorithms. We also present results for the classification of prostate cancer using multispectral images, an important problem in biomedicine.
Resumo:
The use of image processing techniques to assess the performance of airport landing lighting using images of it collected from an aircraft-mounted camera is documented. In order to assess the performance of the lighting, it is necessary to uniquely identify each luminaire within an image and then track the luminaires through the entire sequence and store the relevant information for each luminaire, that is, the total number of pixels that each luminaire covers and the total grey level of these pixels. This pixel grey level can then be used for performance assessment. The authors propose a robust model-based (MB) featurematching technique by which the performance is assessed. The development of this matching technique is the key to the automated performance assessment of airport lighting. The MB matching technique utilises projective geometry in addition to accurate template of the 3D model of a landing-lighting system. The template is projected onto the image data and an optimum match found, using nonlinear least-squares optimisation. The MB matching software is compared with standard feature extraction and tracking techniques known within the community, these being the Kanade–Lucus–Tomasi (KLT) and scaleinvariant feature transform (SIFT) techniques. The new MB matching technique compares favourably with the SIFT and KLT feature-tracking alternatives. As such, it provides a solid foundation to achieve the central aim of this research which is to automatically assess the performance of airport lighting.