996 resultados para sViewpoint Invariant Detection
Resumo:
The lack of viable methods to map and label existing infrastructure is one of the engineering grand challenges for the 21st century. For instance, over two thirds of the effort needed to geometrically model even simple infrastructure is spent on manually converting a cloud of points to a 3D model. The result is that few facilities today have a complete record of as-built information and that as-built models are not produced for the vast majority of new construction and retrofit projects. This leads to rework and design changes that can cost up to 10% of the installed costs. Automatically detecting building components could address this challenge. However, existing methods for detecting building components are not view and scale-invariant, or have only been validated in restricted scenarios that require a priori knowledge without considering occlusions. This leads to their constrained applicability in complex civil infrastructure scenes. In this paper, we test a pose-invariant method of labeling existing infrastructure. This method simultaneously detects objects and estimates their poses. It takes advantage of a recent novel formulation for object detection and customizes it to generic civil infrastructure scenes. Our preliminary experiments demonstrate that this method achieves convincing recognition results.
Resumo:
Compared with other existing methods, the feature point-based image watermarking schemes can resist to global geometric attacks and local geometric attacks, especially cropping and random bending attacks (RBAs), by binding watermark synchronization with salient image characteristics. However, the watermark detection rate remains low in the current feature point-based watermarking schemes. The main reason is that both of feature point extraction and watermark embedding are more or less related to the pixel position, which is seriously distorted by the interpolation error and the shift problem during geometric attacks. In view of these facts, this paper proposes a geometrically robust image watermarking scheme based on local histogram. Our scheme mainly consists of three components: (1) feature points extraction and local circular regions (LCRs) construction are conducted by using Harris-Laplace detector; (2) a mechanism of grapy theoretical clustering-based feature selection is used to choose a set of non-overlapped LCRs, then geometrically invariant LCRs are completely formed through dominant orientation normalization; and (3) the histogram and mean statistically independent of the pixel position are calculated over the selected LCRs and utilized to embed watermarks. Experimental results demonstrate that the proposed scheme can provide sufficient robustness against geometric attacks as well as common image processing operations. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
The problem of automatic face recognition is to visually identify a person in an input image. This task is performed by matching the input face against the faces of known people in a database of faces. Most existing work in face recognition has limited the scope of the problem, however, by dealing primarily with frontal views, neutral expressions, and fixed lighting conditions. To help generalize existing face recognition systems, we look at the problem of recognizing faces under a range of viewpoints. In particular, we consider two cases of this problem: (i) many example views are available of each person, and (ii) only one view is available per person, perhaps a driver's license or passport photograph. Ideally, we would like to address these two cases using a simple view-based approach, where a person is represented in the database by using a number of views on the viewing sphere. While the view-based approach is consistent with case (i), for case (ii) we need to augment the single real view of each person with synthetic views from other viewpoints, views we call 'virtual views'. Virtual views are generated using prior knowledge of face rotation, knowledge that is 'learned' from images of prototype faces. This prior knowledge is used to effectively rotate in depth the single real view available of each person. In this thesis, I present the view-based face recognizer, techniques for synthesizing virtual views, and experimental results using real and virtual views in the recognizer.
Resumo:
End-stopped cells in cortical area V1, which combine out- puts of complex cells tuned to different orientations, serve to detect line and edge crossings (junctions) and points with a large curvature. In this paper we study the importance of the multi-scale keypoint representa- tion, i.e. retinotopic keypoint maps which are tuned to different spatial frequencies (scale or Level-of-Detail). We show that this representation provides important information for Focus-of-Attention (FoA) and object detection. In particular, we show that hierarchically-structured saliency maps for FoA can be obtained, and that combinations over scales in conjunction with spatial symmetries can lead to face detection through grouping operators that deal with keypoints at the eyes, nose and mouth, especially when non-classical receptive field inhibition is employed. Al- though a face detector can be based on feedforward and feedback loops within area V1, such an operator must be embedded into dorsal and ventral data streams to and from higher areas for obtaining translation-, rotation- and scale-invariant face (object) detection.
Resumo:
Le mouvement de la marche est un processus essentiel de l'activité humaine et aussi le résultat de nombreuses interactions collaboratives entre les systèmes neurologiques, articulaires et musculo-squelettiques fonctionnant ensemble efficacement. Ceci explique pourquoi une analyse de la marche est aujourd'hui de plus en plus utilisée pour le diagnostic (et aussi la prévention) de différents types de maladies (neurologiques, musculaires, orthopédique, etc.). Ce rapport présente une nouvelle méthode pour visualiser rapidement les différentes parties du corps humain liées à une possible asymétrie (temporellement invariante par translation) existant dans la démarche d'un patient pour une possible utilisation clinique quotidienne. L'objectif est de fournir une méthode à la fois facile et peu dispendieuse permettant la mesure et l'affichage visuel, d'une manière intuitive et perceptive, des différentes parties asymétriques d'une démarche. La méthode proposée repose sur l'utilisation d'un capteur de profondeur peu dispendieux (la Kinect) qui est très bien adaptée pour un diagnostique rapide effectué dans de petites salles médicales car ce capteur est d'une part facile à installer et ne nécessitant aucun marqueur. L'algorithme que nous allons présenter est basé sur le fait que la marche saine possède des propriétés de symétrie (relativement à une invariance temporelle) dans le plan coronal.
Resumo:
Colour segmentation is the most commonly used method in road signs detection. Road sign contains several basic colours such as red, yellow, blue and white which depends on countries.The objective of this thesis is to do an evaluation of the four colour segmentation algorithms. Dynamic Threshold Algorithm, A Modification of de la Escalera’s Algorithm, the Fuzzy Colour Segmentation Algorithm and Shadow and Highlight Invariant Algorithm. The processing time and segmentation success rate as criteria are used to compare the performance of the four algorithms. And red colour is selected as the target colour to complete the comparison. All the testing images are selected from the Traffic Signs Database of Dalarna University [1] randomly according to the category. These road sign images are taken from a digital camera mounted in a moving car in Sweden.Experiments show that the Fuzzy Colour Segmentation Algorithm and Shadow and Highlight Invariant Algorithm are more accurate and stable to detect red colour of road signs. And the method could also be used in other colours analysis research. The yellow colour which is chosen to evaluate the performance of the four algorithms can reference Master Thesis of Yumei Liu.
Resumo:
AIRES, Kelson R. T. ; ARAÚJO, Hélder J. ; MEDEIROS, Adelardo A. D. . Plane Detection from Monocular Image Sequences. In: VISUALIZATION, IMAGING AND IMAGE PROCESSING, 2008, Palma de Mallorca, Spain. Proceedings..., Palma de Mallorca: VIIP, 2008
Resumo:
AIRES, Kelson R. T.; ARAÚJO, Hélder J.; MEDEIROS, Adelardo A. D. Plane Detection Using Affine Homography. In: CONGRESSO BRASILEIRO DE AUTOMÁTICA, 2008, Juiz de Fora, MG: Anais... do CBA 2008.
Resumo:
Craniosynostosis consists of a premature fusion of the sutures in an infant skull that restricts skull and brain growth. During the last decades, there has been a rapid increase of fundamentally diverse surgical treatment methods. At present, the surgical outcome has been assessed using global variables such as cephalic index, head circumference, and intracranial volume. However, these variables have failed in describing the local deformations and morphological changes that may have a role in the neurologic disorders observed in the patients. This report describes a rigid image registration-based method to evaluate outcomes of craniosynostosis surgical treatments, local quantification of head growth, and indirect intracranial volume change measurements. The developed semiautomatic analysis method was applied to computed tomography data sets of a 5-month-old boy with sagittal craniosynostosis who underwent expansion of the posterior skull with cranioplasty. Quantification of the local changes between pre- and postoperative images was quantified by mapping the minimum distance of individual points from the preoperative to the postoperative surface meshes, and indirect intracranial volume changes were estimated. The proposed methodology can provide the surgeon a tool for the quantitative evaluation of surgical procedures and detection of abnormalities of the infant skull and its development.
Resumo:
Derivation of probability estimates complementary to geophysical data sets has gained special attention over the last years. Information about a confidence level of provided physical quantities is required to construct an error budget of higher-level products and to correctly interpret final results of a particular analysis. Regarding the generation of products based on satellite data a common input consists of a cloud mask which allows discrimination between surface and cloud signals. Further the surface information is divided between snow and snow-free components. At any step of this discrimination process a misclassification in a cloud/snow mask propagates to higher-level products and may alter their usability. Within this scope a novel probabilistic cloud mask (PCM) algorithm suited for the 1 km × 1 km Advanced Very High Resolution Radiometer (AVHRR) data is proposed which provides three types of probability estimates between: cloudy/clear-sky, cloudy/snow and clear-sky/snow conditions. As opposed to the majority of available techniques which are usually based on the decision-tree approach in the PCM algorithm all spectral, angular and ancillary information is used in a single step to retrieve probability estimates from the precomputed look-up tables (LUTs). Moreover, the issue of derivation of a single threshold value for a spectral test was overcome by the concept of multidimensional information space which is divided into small bins by an extensive set of intervals. The discrimination between snow and ice clouds and detection of broken, thin clouds was enhanced by means of the invariant coordinate system (ICS) transformation. The study area covers a wide range of environmental conditions spanning from Iceland through central Europe to northern parts of Africa which exhibit diverse difficulties for cloud/snow masking algorithms. The retrieved PCM cloud classification was compared to the Polar Platform System (PPS) version 2012 and Moderate Resolution Imaging Spectroradiometer (MODIS) collection 6 cloud masks, SYNOP (surface synoptic observations) weather reports, Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) vertical feature mask version 3 and to MODIS collection 5 snow mask. The outcomes of conducted analyses proved fine detection skills of the PCM method with results comparable to or better than the reference PPS algorithm.
Resumo:
This paper describes the participation of DAEDALUS at ImageCLEF 2011 Plant Identification task. The task is evaluated as a supervised classification problem over 71 tree species from the French Mediterranean area used as class labels, based on visual content from scan, scan-like and natural photo images. Our approach to this task is to build a classifier based on the detection of keypoints from the images extracted using Lowe’s Scale Invariant Feature Transform (SIFT) algorithm. Although our overall classification score is very low as compared to other participant groups, the main conclusion that can be drawn is that SIFT keypoints seem to work significantly better for photos than for the other image types, so our approach may be a feasible strategy for the classification of this kind of visual content.
Cross-orientation masking is speed invariant between ocular pathways but speed dependent within them
Resumo:
In human (D. H. Baker, T. S. Meese, & R. J. Summers, 2007b) and in cat (B. Li, M. R. Peterson, J. K. Thompson, T. Duong, & R. D. Freeman, 2005; F. Sengpiel & V. Vorobyov, 2005) there are at least two routes to cross-orientation suppression (XOS): a broadband, non-adaptable, monocular (within-eye) pathway and a more narrowband, adaptable interocular (between the eyes) pathway. We further characterized these two routes psychophysically by measuring the weight of suppression across spatio-temporal frequency for cross-oriented pairs of superimposed flickering Gabor patches. Masking functions were normalized to unmasked detection thresholds and fitted by a two-stage model of contrast gain control (T. S. Meese, M. A. Georgeson, & D. H. Baker, 2006) that was developed to accommodate XOS. The weight of monocular suppression was a power function of the scalar quantity ‘speed’ (temporal-frequency/spatial-frequency). This weight can be expressed as the ratio of non-oriented magno- and parvo-like mechanisms, permitting a fast-acting, early locus, as befits the urgency for action associated with high retinal speeds. In contrast, dichoptic-masking functions superimposed. Overall, this (i) provides further evidence for dissociation between the two forms of XOS in humans, and (ii) indicates that the monocular and interocular varieties of XOS are space/time scale-dependent and scale-invariant, respectively. This suggests an image-processing role for interocular XOS that is tailored to natural image statistics—very different from that of the scale-dependent (speed-dependent) monocular variety.
Resumo:
Flow Cytometry analyzers have become trusted companions due to their ability to perform fast and accurate analyses of human blood. The aim of these analyses is to determine the possible existence of abnormalities in the blood that have been correlated with serious disease states, such as infectious mononucleosis, leukemia, and various cancers. Though these analyzers provide important feedback, it is always desired to improve the accuracy of the results. This is evidenced by the occurrences of misclassifications reported by some users of these devices. It is advantageous to provide a pattern interpretation framework that is able to provide better classification ability than is currently available. Toward this end, the purpose of this dissertation was to establish a feature extraction and pattern classification framework capable of providing improved accuracy for detecting specific hematological abnormalities in flow cytometric blood data. ^ This involved extracting a unique and powerful set of shift-invariant statistical features from the multi-dimensional flow cytometry data and then using these features as inputs to a pattern classification engine composed of an artificial neural network (ANN). The contribution of this method consisted of developing a descriptor matrix that can be used to reliably assess if a donor’s blood pattern exhibits a clinically abnormal level of variant lymphocytes, which are blood cells that are potentially indicative of disorders such as leukemia and infectious mononucleosis. ^ This study showed that the set of shift-and-rotation-invariant statistical features extracted from the eigensystem of the flow cytometric data pattern performs better than other commonly-used features in this type of disease detection, exhibiting an accuracy of 80.7%, a sensitivity of 72.3%, and a specificity of 89.2%. This performance represents a major improvement for this type of hematological classifier, which has historically been plagued by poor performance, with accuracies as low as 60% in some cases. This research ultimately shows that an improved feature space was developed that can deliver improved performance for the detection of variant lymphocytes in human blood, thus providing significant utility in the realm of suspect flagging algorithms for the detection of blood-related diseases.^
Resumo:
Classification procedures, including atmospheric correction satellite images as well as classification performance utilizing calibration and validation at different levels, have been investigated in the context of a coarse land-cover classification scheme for the Pachitea Basin. Two different correction methods were tested against no correction in terms of reflectance correction towards a common response for pseudo-invariant features (PIF). The accuracy of classifications derived from each of the three methods was then assessed in a discriminant analysis using crossvalidation at pixel, polygon, region, and image levels. Results indicate that only regression adjusted images using PIFs show no significant difference between images in any of the bands. A comparison of classifications at different levels suggests though that at pixel, polygon, and region levels the accuracy of the classifications do not significantly differ between corrected and uncorrected images. Spatial patterns of land-cover were analyzed in terms of colonization history, infrastructure, suitability of the land, and landownership. The actual use of the land is driven mainly by the ability to access the land and markets as is obvious in the distribution of land cover as a function of distance to rivers and roads. When considering all rivers and roads a threshold distance at which disproportional agro-pastoral land cover switches from over represented to under represented is at about 1km. Best land use suggestions seem not to affect the choice of land use. Differences in abundance of land cover between watersheds are more prevailing than differences between colonist and indigenous groups.
Resumo:
AIRES, Kelson R. T. ; ARAÚJO, Hélder J. ; MEDEIROS, Adelardo A. D. . Plane Detection from Monocular Image Sequences. In: VISUALIZATION, IMAGING AND IMAGE PROCESSING, 2008, Palma de Mallorca, Spain. Proceedings..., Palma de Mallorca: VIIP, 2008