3 results for visual process
in Helda - Digital Repository of the University of Helsinki
Abstract:
The purpose of this study is to find a framework for a holistic approach to, and to form a conceptual toolbox for, investigating changes in signs and in their interpretation. Charles S. Peirce's theory of signs, viewed from a communicative perspective, is taken as the basis for the framework. The concern directing the study is the lack of a framework for analysing the signs of visual artefacts from a holistic perspective, as well as the lack of conceptual tools for doing so. To discover the possibility of such a holistic approach to semiosic processes and to form a conceptual toolbox, the following issues are discussed: i) how the many two-aspect Objects involved in Peirce's definition of sign-action promote multiple semiosis arising from the same sign by the same Interpretant, depending on which Object dominates; ii) in which way the relation of the individual to society or group can be made more apparent in the construction of the self, since this construction is intertwined with the processes of meaning-creation and interpretation; iii) how to account for the fundamental role of emotions in semiosis, and for the relation of emotions to the often neglected topic of embodiment; iv) how to take into account the dynamic, mediating and processual nature of sign-action in analysing and understanding changes in signs and in their interpretation. An interdisciplinary approach is chosen for this dissertation. Concepts developed within social psychology, developmental psychology, the neurosciences and semiotics are discussed. The common aspect of these approaches is that, in one way or another, they concentrate on the mediation provided by signs in explaining human activity and cognition. The resulting holistic approach and conceptual toolbox are employed in a case study: an analysis of beer brands, including a comparison of brands from two different cultures.
It becomes clear that the different theories and approaches have mutual affinities and complement each other. In addition, the affinities across different disciplines lend some credence to the various views. From the combined approach described, it becomes apparent that the semiosic process can describe the emerging semiotic self, intertwined with the Umwelt and including emotions. Seeing interpretation and meaning-making through semiosis allows for the analysis of groups while taking the embodied and emotional components into account. It is concluded that emotions have a crucial role in all human activity, including so-called reflective thinking, and that emotions and embodiment should be consciously taken into account in analysing signs, their interpretation, and changes in signs and interpretations at both the social and the individual level. The analysis of the beer labels illustrates well the intertwined nature of the relationship between signs, individual consumers and society. Many direct influences from society on the label design are found, as well as some indirect attitude changes that become apparent from magazines, company reports and the like. In addition, the analysis brings up the unifying tendency of visual artefacts across different cultures, but also demonstrates that visual artefacts can hold local signs and meanings, and can sometimes represent local meanings even though the signs have changed in the unifying process.
Abstract:
In visual object detection and recognition, classifiers have two interesting characteristics: accuracy and speed. Accuracy depends on the complexity of the image features and classifier decision surfaces. Speed depends on the hardware and the computational effort required to use the features and decision surfaces. When attempts to increase accuracy lead to increases in complexity and effort, it is necessary to ask how much we are willing to pay for increased accuracy. For example, if increased computational effort implies quickly diminishing returns in accuracy, then those designing inexpensive surveillance applications cannot aim for maximum accuracy at any cost. It becomes necessary to find trade-offs between accuracy and effort. We study efficient classification of images depicting real-world objects and scenes. Classification is efficient when a classifier can be controlled so that the desired trade-off between accuracy and effort (speed) is achieved and unnecessary computations are avoided on a per-input basis. A framework is proposed for understanding and modeling efficient classification of images. Classification is modeled as a tree-like process. In designing the framework, it is important to recognize what is essential and to avoid structures that are narrow in applicability. Earlier frameworks are lacking in this regard. The overall contribution is two-fold. First, the framework is presented, subjected to experiments, and shown to be satisfactory. Second, certain unconventional approaches are experimented with. This allows the separation of the essential from the conventional. To determine whether the framework is satisfactory, three categories of questions are identified: trade-off optimization, classifier tree organization, and rules for delegation and confidence modeling. Questions and problems related to each category are addressed and empirical results are presented.
For example, related to trade-off optimization, we address the problem of computational bottlenecks that limit the range of trade-offs. We also ask whether accuracy-versus-effort trade-offs can be controlled after training. As another example, regarding classifier tree organization, we first consider the task of organizing a tree in a problem-specific manner, and then ask whether problem-specific organization is necessary.
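The delegation-and-confidence idea above can be illustrated with a minimal sketch (not the dissertation's actual framework): a two-stage cascade in which a cheap classifier handles inputs it is confident about and delegates the rest to an expensive classifier. The threshold `tau`, the toy classifiers, and the synthetic inputs are all assumptions for illustration; the point is that `tau` controls the accuracy-versus-effort trade-off after training, on a per-input basis.

```python
# Toy two-stage cascade: cheap stage with early exit, expensive fallback.
# All classifiers and data here are synthetic, purely for illustration.

def cheap_classifier(x):
    """Fast, weak stage: decide from a single feature."""
    score = x[0]                    # pretend this is one cheap feature
    label = 1 if score > 0 else 0
    confidence = abs(score)         # distance from the decision boundary
    return label, confidence

def expensive_classifier(x):
    """Slow, strong stage: uses all features."""
    score = sum(x) / len(x)
    return 1 if score > 0 else 0

def cascade_predict(x, tau):
    """Delegate to the expensive stage only when the cheap stage is unsure."""
    label, confidence = cheap_classifier(x)
    if confidence >= tau:
        return label, "cheap"       # early exit: computation saved
    return expensive_classifier(x), "expensive"

inputs = [(2.0, -0.5), (0.1, 1.5), (-3.0, 0.2), (-0.05, -1.0)]
for tau in (0.5, 5.0):              # low tau = fast, high tau = careful
    used = [cascade_predict(x, tau)[1] for x in inputs]
    print(f"tau={tau}: {used.count('expensive')} of {len(inputs)} delegated")
```

Raising `tau` delegates more inputs to the expensive stage (more effort, potentially more accuracy); lowering it saves effort on easy inputs. The same control knob generalizes from this two-node chain to deeper classifier trees.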
Abstract:
The paradigm of computational vision hypothesizes that any visual function -- such as the recognition of your grandparent -- can be replicated by computational processing of the visual input. What are these computations that the brain performs? What should or could they be? Working on the latter question, this dissertation takes the statistical approach, in which the suitable computations are learned from natural visual data itself. In particular, we empirically study the computational processing that emerges from the statistical properties of the visual world and from the constraints and objectives specified for the learning process. This thesis consists of an introduction and seven peer-reviewed publications, where the purpose of the introduction is to illustrate the area of study to a reader who is not familiar with computational vision research. In the scope of the introduction, we briefly overview the primary challenges of visual processing, and recall some of the current opinions on visual processing in the early visual systems of animals. Next, we describe the methodology we have used in our research and discuss the presented results. We have included in this discussion some additional remarks, speculations and conclusions that were not featured in the original publications. We present the following results in the publications of this thesis. First, we empirically demonstrate that luminance and contrast are strongly dependent in natural images, contradicting previous theories suggesting that luminance and contrast were processed separately in natural systems due to their independence in the visual data. Second, we show that simple-cell-like receptive fields of the primary visual cortex can be learned in the nonlinear contrast domain by maximization of independence.
Further, we provide the first reports of the emergence of conjunctive (corner-detecting) and subtractive (opponent orientation) processing due to nonlinear projection pursuit with simple objective functions related to sparseness and response energy optimization. Then, we show that attempting to extract independent components of the nonlinear histogram statistics of a biologically plausible representation leads to projection directions that appear to differentiate between visual contexts. Such processing might be applicable for priming, i.e., the selection and tuning of later visual processing. We continue by showing that a different kind of thresholded low-frequency priming can be learned and used to make object detection faster with little loss in accuracy. Finally, we show that in a computational object detection setting, nonlinearly gain-controlled visual features of medium complexity can be acquired sequentially as images are encountered and discarded. We present two online algorithms to perform this feature selection, and propose the idea that for artificial systems, some processing mechanisms could be selectable from the environment without optimizing the mechanisms themselves. In summary, this thesis explores learning visual processing on several levels. The learning can be understood as an interplay of input data, model structures, learning objectives, and estimation algorithms. The presented work adds to the growing body of evidence that statistical methods can be used to acquire intuitively meaningful visual processing mechanisms. The work also presents some predictions and ideas regarding biological visual processing.
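The general recipe running through this abstract -- projecting data onto directions chosen by a sparseness-related objective -- can be sketched in a few lines. The following is a simplified illustration, not the publications' actual methods: a FastICA-style fixed-point iteration with the cubic nonlinearity recovers a sparse (super-Gaussian) source direction from synthetic whitened data. The rotation mixing matrix, sample size, and iteration count are all assumptions chosen to keep the example small.

```python
import numpy as np

# Sparseness maximization on whitened data, FastICA-style fixed point
# with g(u) = u^3. Synthetic stand-in for natural-image patch statistics.
rng = np.random.default_rng(0)

# Two independent Laplacian (sparse) sources, scaled to unit variance.
S = rng.laplace(size=(2, 20000)) / np.sqrt(2)

# Mix with a rotation: the mixed data stays white, so no whitening step.
theta = 0.7
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X = A @ S

# Fixed-point iteration: w <- E[x (w.x)^3] - 3 w, then renormalize.
w = rng.standard_normal(2)
w /= np.linalg.norm(w)
for _ in range(50):
    u = w @ X
    w = (X * u**3).mean(axis=1) - 3 * w
    w /= np.linalg.norm(w)

# The learned direction should align with a column of the mixing matrix,
# i.e., with one of the sparse source directions.
alignment = max(abs(w @ A[:, 0]), abs(w @ A[:, 1]))
print(f"alignment with a source direction: {alignment:.2f}")
```

On natural-image patches instead of synthetic sources, the same kind of objective is what yields localized, oriented, simple-cell-like receptive fields; the nonlinear variants described above change the domain the statistics are computed in, not this basic learn-a-direction loop.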