We propose a joint representation and classification framework that achieves the dual goal of finding the most discriminative sparse overcomplete encoding and optimal classifier parameters. Formulating an optimization problem that combines the objective function of the classification with the representation error of both labeled and unlabeled data, constrained by sparsity, we propose an algorithm that alternates between solving for subsets of parameters whilst preserving the sparsity. The method is then evaluated on two important classification problems in computer vision: object categorization of natural images using the Caltech 101 database and face recognition using the Extended Yale B face database. The results show that the proposed method is competitive with other recently proposed sparse overcomplete counterparts and considerably outperforms many recently proposed face recognition techniques when the number of training samples is small.


In this paper the author traces the possibilities afforded by engaging with the aesthetic, historic and socio-political nature of shodo (Japanese calligraphy) as an intersectional space. Shodo, literally translated as 'the way of writing', is an artistic practice bringing together ink, brush and paper. It is simultaneously a juncture between studied discipline and an ongoing mediation of subjectivities. The calligrapher/writer/drawer communicates to the reader through bold or subtle brush strokes and through the pressure and movement at the completion of each stroke. The calligrapher/writer/drawer draws across the boundaries of text and image to meet the reader, blurring the lines between subject and object. This discussion re-examines the hierarchical binaries of writing/drawing, text/image, self/Other as they play out from vanishing lines of distinction between truth and conjecture. Crossing these binaries opens up opportunity for decentring and questioning representational practice by enabling other possible meanings and practices to emerge (Lather, 2007). I work from a stance of theoretical promiscuity in order to disrupt constitutive discourses and restore the liminal in social research. Drawing across the fragments of research projects, I illustrate the generative and speculative space of visualising pedascapes in educational research.


Virtual reality and simulation are becoming increasingly important in modern society, and it is essential to improve our understanding of system usability and efficacy from the users’ perspective. This paper introduces a novel evaluation method designed to assess human user capability when undertaking technical and procedural training using virtual training systems. The evaluation method falls under the user-centred design and evaluation paradigm and draws on theories of cognitive, skill-based and affective learning outcomes. The method focuses on user interaction with haptic-audio-visual interfaces and the complexities related to variability in users’ performance, and the adoption and acceptance of the technologies. A large-scale user study was performed, focusing on object assembly training tasks that involved selecting, rotating, releasing, inserting and manipulating 3D objects. The study demonstrated the advantages of the method in obtaining valuable multimodal information for accurate and comprehensive evaluation of virtual training system efficacy. The study investigated how well users learn, perform, adapt to and perceive the virtual training. The results of the study revealed valuable aspects of the design and evaluation of virtual training systems, contributing to an improved understanding of more usable virtual training systems.


Sparse representation has been introduced to address many recognition problems in computer vision. In this paper, we propose a new framework for object categorization based on sparse representation of local features. Unlike most previous sparse-coding-based methods for object classification, which use sparse coding only to extract high-level features, the proposed method incorporates sparse representation and classification into a unified framework. Therefore, it does not need a further classifier. Experimental results show that the proposed method achieves accuracy better than or comparable to that of the well-known bag-of-features representation with various classifiers.
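
The abstract does not say how the unified framework assigns a label. One common way to classify from a sparse code without a separate classifier is to pick the class whose dictionary atoms best reconstruct the sample. The sketch below illustrates that generic residual rule only and is not the authors' method: the greedy matching-pursuit coder, the function names and the toy atoms are all assumptions made for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]


def matching_pursuit(atoms, x, n_steps=2):
    """Greedy sparse coding over unit-norm atoms (hypothetical helper).

    Each step selects the atom most correlated with the current residual,
    so at most n_steps coefficients end up non-zero.
    """
    coeffs = [0.0] * len(atoms)
    residual = list(x)
    for _ in range(n_steps):
        scores = [dot(a, residual) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        coeffs[k] += scores[k]
        residual = [r - scores[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs


def classify(atoms, labels, x, n_steps=2):
    """Return the label whose atoms alone give the smallest reconstruction error."""
    coeffs = matching_pursuit(atoms, x, n_steps)
    errors = {}
    for c in set(labels):
        recon = [0.0] * len(x)
        for atom, label, w in zip(atoms, labels, coeffs):
            if label == c:
                recon = [r + w * a for r, a in zip(recon, atom)]
        errors[c] = math.sqrt(sum((xi - ri) ** 2 for xi, ri in zip(x, recon)))
    return min(errors, key=errors.get)


# Toy dictionary: two atoms per class (illustrative values only).
ATOMS = [normalize(v) for v in ([1, 0, 0], [1, 0.2, 0], [0, 1, 0], [0, 1, 0.2])]
LABELS = ["a", "a", "b", "b"]
```

Calling `classify(ATOMS, LABELS, [0.9, 0.1, 0.0])` codes the sample over all atoms and then compares the per-class reconstruction errors, so no classifier is trained on top of the codes.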


We propose a framework for visual and haptic collaboration in X3D/VRML shared virtual spaces. In this collaborative framework, two pipelines (visual and haptic) complement each other to provide a simple and efficient solution to problems requiring collaboration in shared virtual spaces on the web. We consider shared objects, defined as virtual objects whose visual and physical properties are rendered synchronously on each client computer. We introduce virtual tools, which are shared objects associated with interactive and haptic devices. We implemented the proposed ideas as a server-client framework with a dedicated viewer. We discuss two implementation frameworks based on the strong and thin server concepts.


In this chapter we focus on face appearance-based biometrics. The cheap and readily available hardware used to acquire data, their non-invasiveness and the ease of employing them from a distance and without the awareness of the user are just some of the reasons why these continue to be of great practical interest. However, a number of research challenges remain. Specifically, face biometrics have traditionally focused on images acquired in the visible light spectrum, and these are greatly affected by extrinsic factors such as illumination, camera angle (or, equivalently, head pose) and occlusion. In practice, the effects of changing pose are usually least problematic and can oftentimes be overcome by acquiring data over a time period, e.g., by tracking a face in a surveillance video. Consequently, image sequence or image set matching has recently gained a lot of attention in the literature [137–139] and is the paradigm adopted in this chapter as well. In other words, we assume that the training image set for each individual contains some variability in pose, but is not obtained in scripted conditions or in controlled illumination. In contrast, illumination is much more difficult to deal with: the illumination setup is in most cases not practical to control and its physics is difficult to model accurately. Thermal spectrum imagery is useful in this regard as it is virtually insensitive to illumination changes, as illustrated in Fig. 6.1. On the other hand, it lacks much of the individual, discriminating facial detail contained in visual images. In this sense, the two modalities can be seen as complementing each other. The key idea behind the system presented in this chapter is that robustness to extreme illumination changes can be achieved by fusing the two. This paradigm will further prove useful when we consider the difficulty of recognition in the presence of occlusion caused by prescription glasses.


Recognition algorithms that use data obtained by imaging faces in the thermal spectrum are promising in achieving invariance to the extreme illumination changes that are often present in practice. In this paper we analyze the performance of a recently proposed face recognition algorithm that combines visual and thermal modalities by decision-level fusion. We examine (i) the effects of the proposed data preprocessing in each domain, (ii) the contribution of different types of features to improved recognition, and (iii) the importance of prescription glasses detection, in the context of both 1-to-N and 1-to-1 matching (recognition vs. verification performance). Finally, we discuss the significance of our results and, in particular, identify a number of limitations of the current state-of-the-art and propose promising directions for future research.
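
As a rough illustration of decision-level fusion and of the difference between 1-to-N and 1-to-1 matching, the sketch below combines per-identity similarity scores from two matchers with a weighted sum. It is a generic example and not the algorithm analyzed in the paper: the weighting rule, the weight `w_visual`, the threshold and the identity names are all assumptions.

```python
def fuse_scores(visual, thermal, w_visual=0.5):
    """Weighted-sum fusion of per-identity similarity scores.

    `visual` and `thermal` map an enrolled identity to a similarity in [0, 1];
    `w_visual` is a hypothetical weight given to the visual matcher.
    """
    return {
        name: w_visual * visual[name] + (1.0 - w_visual) * thermal[name]
        for name in visual
    }


def identify(fused):
    """1-to-N matching (recognition): return the best-scoring enrolled identity."""
    return max(fused, key=fused.get)


def verify(fused, claimed, threshold=0.6):
    """1-to-1 matching (verification): accept the claim if its score clears the threshold."""
    return fused[claimed] >= threshold


# Illustrative scores only; not taken from any experiment.
visual_scores = {"alice": 0.4, "bob": 0.7}
thermal_scores = {"alice": 0.9, "bob": 0.3}
fused = fuse_scores(visual_scores, thermal_scores)
```

With these toy scores the visual matcher alone would favour "bob", while the fused scores favour "alice", which is the kind of disagreement between modalities that fusion at the decision level is meant to resolve.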


Face recognition systems achieve high performance in controlled environments, but performance degrades with variations in illumination, facial expression, and pose. Efforts have been made to explore alternate face modalities such as infrared (IR) and 3-D for face recognition. Studies also demonstrate that fusion of multiple face modalities improves performance compared with single-modal face recognition. This paper categorizes these algorithms into single-modal and multimodal face recognition and evaluates methods within each category via detailed descriptions of representative work and summarizations in tables. Advantages and disadvantages of each modality for face recognition are analyzed. In addition, face databases and system evaluations are also covered.


Object segmentation is widely recognized as one of the most challenging problems in computer vision. One major problem of existing methods is that most of them are vulnerable to cluttered backgrounds. Moreover, human intervention is often required to specify foreground/background priors, which restricts the use of object segmentation in real-world scenarios. To address these problems, we propose a novel approach that learns complementary saliency priors for foreground object segmentation in complex scenes. Unlike existing saliency-based segmentation approaches, we propose to learn two complementary saliency maps that reveal the most reliable foreground and background regions. Given such priors, foreground object segmentation is formulated as a binary pixel labelling problem that can be efficiently solved using graph cuts. As such, the confident saliency priors can be utilized to extract the most salient objects and reduce the distraction of cluttered backgrounds. Extensive experiments show that our approach remarkably outperforms 16 state-of-the-art methods on three public image benchmarks.


The development of software artifacts is an engineering process and, like any engineering process, involves a series of stages that must be carried out using an appropriate methodology. For a given piece of software to achieve its goals, its conceptual and architectural characteristics must be well defined before implementation. Hyperdocument-based applications have a specific characteristic: the definition of their navigational aspects. Navigation is a critical stage in the definition of hyperdocument-based software, because it guides the user during a session of visiting a site's content. A flaw in the navigation specification process causes a loss of context, disorienting the user within the application space. Several methodologies exist for handling the navigation characteristics of hyperdocument-based applications. The main methodologies found in the literature were studied and analyzed in this work. A comparative analysis of the methodologies was carried out, outlining their approaches and stages. The study of hyperdocument specification approaches was a preliminary stage that served as the basis for the objective of this work. The focus is the construction of a graphical tool for the conceptual specification of hyperdocuments, following a modeling methodology for hyperdocument-based software. The method adopted was OOHDM (Object-Oriented Hypermedia Design Model), because it covers all stages of an application development process, with particular attention to navigation. The tool implements a graphical interface in which the user can model the application by creating models. The specification process comprises three models: conceptual modeling, navigational modeling and interface modeling. The characteristics of the application are defined in an incremental process that begins with the conceptual definition and ends with the interface characteristics.
The tool generates a prototype of the application in XML. To present the pages in a Web browser, XSLT was used to convert the information from XML to HTML. The models created through the abstract specification stages of the application are exported in OOHDM-ML. A case study was implemented to validate the tool. The main contribution of this work is the construction of a graphical environment for the abstract specification of hyperdocuments, together with an environment for implementing prototypes and exporting models. The aim is to guide, direct and discipline the user's work during the application specification process.


Ornamental fish may be severely affected by a stressful environment. Stressors impair the immune response, reproduction and growth rate; thus, the identification of possible stressors will help to improve the overall quality of ornamental fish. The aim of this study was to determine whole-body cortisol of adult zebrafish, Danio rerio, following visual or direct contact with a predator species. Zebrafish were distributed in three groups: the first group, which consisted of zebrafish reared completely isolated from the predator, was considered the negative control; the second group, in which the predator, Parachromis managuensis, was stocked together with zebrafish, was considered the positive control; the third group consisted of zebrafish stocked in a glass aquarium, with direct visual contact with the predator. The mean whole-body cortisol concentration in zebrafish from the negative control was 6.78 ± 1.12 ng g⁻¹, a concentration statistically lower than that found in zebrafish having visual contact with the predator (9.26 ± 0.88 ng g⁻¹) which, in turn, was statistically lower than the mean whole-body cortisol of the positive control group (12.35 ± 1.59 ng g⁻¹). The higher whole-body cortisol concentration found in fish from the positive control can be attributed to the detection, by the zebrafish, of relevant risk situations that may involve a combination of chemical, olfactory and visual cues. One of the functions of elevated cortisol is to mobilize energy from body resources to cope with stress. The elevation of whole-body cortisol in fish subjected to visual contact with the predator involves only the visual cue in the recognition of predation risk. We hypothesized that the zebrafish could recognize predator characteristics in P. managuensis, such as length, shape, color and behavior. Nonetheless, the elevation of whole-body cortisol in zebrafish suggested that visual contact with the predator may elicit a stress response in prey fish.
This assertion has a strong practical application concerning the species distribution in ornamental fish markets, in which prey species should not be allowed to see predator species. Minimizing visual contact between prey and predator fish may improve the quality, viability and welfare of small fish in ornamental fish markets. (c) 2007 Elsevier B.V. All rights reserved.


Visual attention is a very important task in autonomous robotics, but, because of its complexity, the processing time required is significant. We propose an architecture for feature selection using foveated images that is guided by visual attention tasks and that reduces the processing time required to perform these tasks. Our system can be applied in bottom-up or top-down visual attention. The foveated model determines which scales are to be used by the feature extraction algorithm. The system is able to discard features that are not strictly necessary for the tasks, thus reducing the processing time. If the fovea is correctly placed, then it is possible to reduce the processing time without compromising the quality of the task outputs. The distance of the fovea from the object is also analyzed. If the visual system loses tracking in top-down attention, basic strategies of fovea placement can be applied. Experiments have shown that this approach can reduce the processing time by up to 60%. To validate the method, we tested it with the feature algorithm known as Speeded Up Robust Features (SURF), one of the most efficient approaches for feature extraction. With the proposed architecture, we can meet the real-time requirements of robotic vision, mainly for application in autonomous robotics.