7 resultados para correlation-based feature selection
em Helda - Digital Repository of University of Helsinki
Resumo:
The use of remote sensing imagery as auxiliary data in forest inventory is based on the correlation between features extracted from the images and the ground truth. The bidirectional reflectance and radial displacement cause variation in image features located in different segments of the image but forest characteristics remaining the same. The variation has so far been diminished by different radiometric corrections. In this study the use of sun azimuth based converted image co-ordinates was examined to supplement auxiliary data extracted from digitised aerial photographs. The method was considered as an alternative for radiometric corrections. Additionally, the usefulness of multi-image interpretation of digitised aerial photographs in regression estimation of forest characteristics was studied. The state owned study area located in Leivonmäki, Central Finland and the study material consisted of five digitised and ortho-rectified colour-infrared (CIR) aerial photographs and field measurements of 388 plots, out of which 194 were relascope (Bitterlich) plots and 194 were concentric circular plots. Both the image data and the field measurements were from the year 1999. When examining the effect of the location of the image point on pixel values and texture features of Finnish forest plots in digitised CIR photographs the clearest differences were found between front-and back-lighted image halves. Inside the image half the differences between different blocks were clearly bigger on the front-lighted half than on the back-lighted half. The strength of the phenomenon varied by forest category. The differences between pixel values extracted from different image blocks were greatest in developed and mature stands and smallest in young stands. The differences between texture features were greatest in developing stands and smallest in young and mature stands. The logarithm of timber volume per hectare and the angular transformation of the proportion of broadleaved trees of the total volume were used as dependent variables in regression models. Five different converted image co-ordinates based trend surfaces were used in models in order to diminish the effect of the bidirectional reflectance. The reference model of total volume, in which the location of the image point had been ignored, resulted in RMSE of 1,268 calculated from test material. The best of the trend surfaces was the complete third order surface, which resulted in RMSE of 1,107. The reference model of the proportion of broadleaved trees resulted in RMSE of 0,4292 and the second order trend surface was the best, resulting in RMSE of 0,4270. The trend surface method is applicable, but it has to be applied by forest category and by variable. The usefulness of multi-image interpretation of digitised aerial photographs was studied by building comparable regression models using either the front-lighted image features, back-lighted image features or both. The two-image model turned out to be slightly better than the one-image models in total volume estimation. The best one-image model resulted in RMSE of 1,098 and the two-image model resulted in RMSE of 1,090. The homologous features did not improve the models of the proportion of broadleaved trees. The overall result gives motivation for further research of multi-image interpretation. The focus may be improving regression estimation and feature selection or examination of stratification used in two-phase sampling inventory techniques. Keywords: forest inventory, digitised aerial photograph, bidirectional reflectance, converted image coordinates, regression estimation, multi-image interpretation, pixel value, texture, trend surface
Resumo:
The paradigm of computational vision hypothesizes that any visual function -- such as the recognition of your grandparent -- can be replicated by computational processing of the visual input. What are these computations that the brain performs? What should or could they be? Working on the latter question, this dissertation takes the statistical approach, where the suitable computations are attempted to be learned from the natural visual data itself. In particular, we empirically study the computational processing that emerges from the statistical properties of the visual world and the constraints and objectives specified for the learning process. This thesis consists of an introduction and 7 peer-reviewed publications, where the purpose of the introduction is to illustrate the area of study to a reader who is not familiar with computational vision research. In the scope of the introduction, we will briefly overview the primary challenges to visual processing, as well as recall some of the current opinions on visual processing in the early visual systems of animals. Next, we describe the methodology we have used in our research, and discuss the presented results. We have included some additional remarks, speculations and conclusions to this discussion that were not featured in the original publications. We present the following results in the publications of this thesis. First, we empirically demonstrate that luminance and contrast are strongly dependent in natural images, contradicting previous theories suggesting that luminance and contrast were processed separately in natural systems due to their independence in the visual data. Second, we show that simple cell -like receptive fields of the primary visual cortex can be learned in the nonlinear contrast domain by maximization of independence. Further, we provide first-time reports of the emergence of conjunctive (corner-detecting) and subtractive (opponent orientation) processing due to nonlinear projection pursuit with simple objective functions related to sparseness and response energy optimization. Then, we show that attempting to extract independent components of nonlinear histogram statistics of a biologically plausible representation leads to projection directions that appear to differentiate between visual contexts. Such processing might be applicable for priming, \ie the selection and tuning of later visual processing. We continue by showing that a different kind of thresholded low-frequency priming can be learned and used to make object detection faster with little loss in accuracy. Finally, we show that in a computational object detection setting, nonlinearly gain-controlled visual features of medium complexity can be acquired sequentially as images are encountered and discarded. We present two online algorithms to perform this feature selection, and propose the idea that for artificial systems, some processing mechanisms could be selectable from the environment without optimizing the mechanisms themselves. In summary, this thesis explores learning visual processing on several levels. The learning can be understood as interplay of input data, model structures, learning objectives, and estimation algorithms. The presented work adds to the growing body of evidence showing that statistical methods can be used to acquire intuitively meaningful visual processing mechanisms. The work also presents some predictions and ideas regarding biological visual processing.
Resumo:
Aptitude-based student selection: A study concerning the admission processes of some technically oriented healthcare degree programmes in Finland (Orthotics and Prosthetics, Dental Technology and Optometry). The data studied consisted of conveniencesamples of preadmission information and the results of the admission processes of three technically oriented healthcare degree programmes (Orthotics and Prosthetics, Dental Technology and Optometry) in Finland during the years 1977-1986 and 2003. The number of the subjects tested and interviewed in the first samples was 191, 615 and 606, and in the second 67, 64 and 89, respectively. The questions of the six studies were: I. How were different kinds of preadmission data related to each other? II. Which were the major determinants of the admission decisions? III. Did the graduated students and those who dropped out differ from each other? IV. Was it possible to predict how well students would perform in the programmes? V. How was the student selection executed in the year 2003? VI. Should clinical vs. statistical prediction or both be used? (Some remarks are presented on Meehl's argument: "Always, we might as well face it, the shadow of the statistician hovers in the background; always the actuary will have the final word.") The main results of the study were as follows: Ability tests, dexterity tests and judgements of personality traits (communication skills, initiative, stress tolerance and motivation) provided unique, non-redundant information about the applicants. Available demographic variables did not bias the judgements of personality traits. In all three programme settings, four-factor solutions (personality, reasoning, gender-technical and age-vocational with factor scores) could be extracted by the Maximum Likelihood method with graphical Varimax rotation. The personality factor dominated the final aptitude judgements and very strongly affected the selection decisions. There were no clear differences between graduated students and those who had dropped out in regard to the four factors. In addition, the factor scores did not predict how well the students performed in the programmes. Meehl's argument on the uncertainty of clinical prediction was supported by the results, which on the other hand did not provide any relevant data for rules on statistical prediction. No clear arguments for or against the aptitude-based student selection was presented. However, the structure of the aptitude measures and their impact on the admission process are now better known. The concept of "personal aptitude" is not necessarily included in the values and preferences of those in charge of organizing the schooling. Thus, obviously the most well-founded and cost-effective way to execute student selection is to rely on e.g. the grade point averages of the matriculation examination and/or written entrance exams. This procedure, according to the present study, would result in a student group which has a quite different makeup (60%) from the group selected on the basis of aptitude tests. For the recruiting organizations, instead, "personal aptitude" may be a matter of great importance. The employers, of course, decide on personnel selection. The psychologists, if consulted, are responsible for the proper use of psychological measures.
Resumo:
In this dissertation, I present an overall methodological framework for studying linguistic alternations, focusing specifically on lexical variation in denoting a single meaning, that is, synonymy. As the practical example, I employ the synonymous set of the four most common Finnish verbs denoting THINK, namely ajatella, miettiä, pohtia and harkita ‘think, reflect, ponder, consider’. As a continuation to previous work, I describe in considerable detail the extension of statistical methods from dichotomous linguistic settings (e.g., Gries 2003; Bresnan et al. 2007) to polytomous ones, that is, concerning more than two possible alternative outcomes. The applied statistical methods are arranged into a succession of stages with increasing complexity, proceeding from univariate via bivariate to multivariate techniques in the end. As the central multivariate method, I argue for the use of polytomous logistic regression and demonstrate its practical implementation to the studied phenomenon, thus extending the work by Bresnan et al. (2007), who applied simple (binary) logistic regression to a dichotomous structural alternation in English. The results of the various statistical analyses confirm that a wide range of contextual features across different categories are indeed associated with the use and selection of the selected think lexemes; however, a substantial part of these features are not exemplified in current Finnish lexicographical descriptions. The multivariate analysis results indicate that the semantic classifications of syntactic argument types are on the average the most distinctive feature category, followed by overall semantic characterizations of the verb chains, and then syntactic argument types alone, with morphological features pertaining to the verb chain and extra-linguistic features relegated to the last position. In terms of overall performance of the multivariate analysis and modeling, the prediction accuracy seems to reach a ceiling at a Recall rate of roughly two-thirds of the sentences in the research corpus. The analysis of these results suggests a limit to what can be explained and determined within the immediate sentential context and applying the conventional descriptive and analytical apparatus based on currently available linguistic theories and models. The results also support Bresnan’s (2007) and others’ (e.g., Bod et al. 2003) probabilistic view of the relationship between linguistic usage and the underlying linguistic system, in which only a minority of linguistic choices are categorical, given the known context – represented as a feature cluster – that can be analytically grasped and identified. Instead, most contexts exhibit degrees of variation as to their outcomes, resulting in proportionate choices over longer stretches of usage in texts or speech.
Resumo:
The earliest stages of human cortical visual processing can be conceived as extraction of local stimulus features. However, more complex visual functions, such as object recognition, require integration of multiple features. Recently, neural processes underlying feature integration in the visual system have been under intensive study. A specialized mid-level stage preceding the object recognition stage has been proposed to account for the processing of contours, surfaces and shapes as well as configuration. This thesis consists of four experimental, psychophysical studies on human visual feature integration. In two studies, classification image a recently developed psychophysical reverse correlation method was used. In this method visual noise is added to near-threshold stimuli. By investigating the relationship between random features in the noise and observer s perceptual decision in each trial, it is possible to estimate what features of the stimuli are critical for the task. The method allows visualizing the critical features that are used in a psychophysical task directly as a spatial correlation map, yielding an effective "behavioral receptive field". Visual context is known to modulate the perception of stimulus features. Some of these interactions are quite complex, and it is not known whether they reflect early or late stages of perceptual processing. The first study investigated the mechanisms of collinear facilitation, where nearby collinear Gabor flankers increase the detectability of a central Gabor. The behavioral receptive field of the mechanism mediating the detection of the central Gabor stimulus was measured by the classification image method. The results show that collinear flankers increase the extent of the behavioral receptive field for the central Gabor, in the direction of the flankers. The increased sensitivity at the ends of the receptive field suggests a low-level explanation for the facilitation. The second study investigated how visual features are integrated into percepts of surface brightness. A novel variant of the classification image method with brightness matching task was used. Many theories assume that perceived brightness is based on the analysis of luminance border features. Here, for the first time this assumption was directly tested. The classification images show that the perceived brightness of both an illusory Craik-O Brien-Cornsweet stimulus and a real uniform step stimulus depends solely on the border. Moreover, the spatial tuning of the features remains almost constant when the stimulus size is changed, suggesting that brightness perception is based on the output of a single spatial frequency channel. The third and fourth studies investigated global form integration in random-dot Glass patterns. In these patterns, a global form can be immediately perceived, if even a small proportion of random dots are paired to dipoles according to a geometrical rule. In the third study the discrimination of orientation structure in highly coherent concentric and Cartesian (straight) Glass patterns was measured. The results showed that the global form was more efficiently discriminated in concentric patterns. The fourth study investigated how form detectability depends on the global regularity of the Glass pattern. The local structure was either Cartesian or curved. It was shown that randomizing the local orientation deteriorated the performance only with the curved pattern. The results give support for the idea that curved and Cartesian patterns are processed in at least partially separate neural systems.
Resumo:
One major reason for the global decline of biodiversity is habitat loss and fragmentation. Conservation areas can be designed to reduce biodiversity loss, but as resources are limited, conservation efforts need to be prioritized in order to achieve best possible outcomes. The field of systematic conservation planning developed as a response to opportunistic approaches to conservation that often resulted in biased representation of biological diversity. The last two decades have seen the development of increasingly sophisticated methods that account for information about biodiversity conservation goals (benefits), economical considerations (costs) and socio-political constraints. In this thesis I focus on two general topics related to systematic conservation planning. First, I address two aspects of the question about how biodiversity features should be valued. (i) I investigate the extremely important but often neglected issue of differential prioritization of species for conservation. Species prioritization can be based on various criteria, and is always goal-dependent, but can also be implemented in a scientifically more rigorous way than what is the usual practice. (ii) I introduce a novel framework for conservation prioritization, which is based on continuous benefit functions that convert increasing levels of biodiversity feature representation to increasing conservation value using the principle that more is better. Traditional target-based systematic conservation planning is a special case of this approach, in which a step function is used for the benefit function. We have further expanded the benefit function framework for area prioritization to address issues such as protected area size and habitat vulnerability. In the second part of the thesis I address the application of community level modelling strategies to conservation prioritization. One of the most serious issues in systematic conservation planning currently is not the deficiency of methodology for selection and design, but simply the lack of data. Community level modelling offers a surrogate strategy that makes conservation planning more feasible in data poor regions. We have reviewed the available community-level approaches to conservation planning. These range from simplistic classification techniques to sophisticated modelling and selection strategies. We have also developed a general and novel community level approach to conservation prioritization that significantly improves on methods that were available before. This thesis introduces further degrees of realism into conservation planning methodology. The benefit function -based conservation prioritization framework largely circumvents the problematic phase of target setting, and allowing for trade-offs between species representation provides a more flexible and hopefully more attractive approach to conservation practitioners. The community-level approach seems highly promising and should prove valuable for conservation planning especially in data poor regions. Future work should focus on integrating prioritization methods to deal with multiple aspects in combination influencing the prioritization process, and further testing and refining the community level strategies using real, large datasets.