18 resultados para Non-rigid image alignment for handshape recognition

em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The usage of digital content, such as video clips and images, has increased dramatically during the last decade. Local image features have been applied increasingly in various image and video retrieval applications. This thesis evaluates local features and applies them to image and video processing tasks. The results of the study show that 1) the performance of different local feature detector and descriptor methods vary significantly in object class matching, 2) local features can be applied in image alignment with superior results against the state-of-the-art, 3) the local feature based shot boundary detection method produces promising results, and 4) the local feature based hierarchical video summarization method shows promising new new research direction. In conclusion, this thesis presents the local features as a powerful tool in many applications and the imminent future work should concentrate on improving the quality of the local features.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Feature extraction is the part of pattern recognition, where the sensor data is transformed into a more suitable form for the machine to interpret. The purpose of this step is also to reduce the amount of information passed to the next stages of the system, and to preserve the essential information in the view of discriminating the data into different classes. For instance, in the case of image analysis the actual image intensities are vulnerable to various environmental effects, such as lighting changes and the feature extraction can be used as means for detecting features, which are invariant to certain types of illumination changes. Finally, classification tries to make decisions based on the previously transformed data. The main focus of this thesis is on developing new methods for the embedded feature extraction based on local non-parametric image descriptors. Also, feature analysis is carried out for the selected image features. Low-level Local Binary Pattern (LBP) based features are in a main role in the analysis. In the embedded domain, the pattern recognition system must usually meet strict performance constraints, such as high speed, compact size and low power consumption. The characteristics of the final system can be seen as a trade-off between these metrics, which is largely affected by the decisions made during the implementation phase. The implementation alternatives of the LBP based feature extraction are explored in the embedded domain in the context of focal-plane vision processors. In particular, the thesis demonstrates the LBP extraction with MIPA4k massively parallel focal-plane processor IC. Also higher level processing is incorporated to this framework, by means of a framework for implementing a single chip face recognition system. Furthermore, a new method for determining optical flow based on LBPs, designed in particular to the embedded domain is presented. Inspired by some of the principles observed through the feature analysis of the Local Binary Patterns, an extension to the well known non-parametric rank transform is proposed, and its performance is evaluated in face recognition experiments with a standard dataset. Finally, an a priori model where the LBPs are seen as combinations of n-tuples is also presented

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Monet teollisuuden konenäkö- ja hahmontunnistusongelmat ovat hyvin samantapaisia, jolloin prototyyppisovelluksia suunniteltaessa voitaisiin hyödyntää pitkälti samoja komponentteja. Oliopohjaiset sovelluskehykset tarjoavat erinomaisen tavan nopeuttaa ohjelmistokehitystä uudelleenkäytettävyyttä parantamalla. Näin voidaan sekä mahdollistaa konenäkösovellusten laajempi käyttö että säästää kustannuksissa. Tässä työssä esitellään konenäkösovelluskehys, joka on perusarkkitehtuuriltaan liukuhihnamainen. Ylätason rakenne koostuu sensorista, datankäsittelyoperaatioista, piirreirrottimesta sekä luokittimesta. Itse sovelluskehyksen lisäksi on toteutettu joukko kuvankäsittely- ja hahmontunnistusoperaatioita. Sovelluskehys nopeuttaa selvästi ohjelmointityötä ja helpottaa uusien kuvankäsittelyoperaatioiden lisää mistä.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Local features are used in many computer vision tasks including visual object categorization, content-based image retrieval and object recognition to mention a few. Local features are points, blobs or regions in images that are extracted using a local feature detector. To make use of extracted local features the localized interest points are described using a local feature descriptor. A descriptor histogram vector is a compact representation of an image and can be used for searching and matching images in databases. In this thesis the performance of local feature detectors and descriptors is evaluated for object class detection task. Features are extracted from image samples belonging to several object classes. Matching features are then searched using random image pairs of a same class. The goal of this thesis is to find out what are the best detector and descriptor methods for such task in terms of detector repeatability and descriptor matching rate.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Currently, laser scribing is growing material processing method in the industry. Benefits of laser scribing technology are studied for example for improving an efficiency of solar cells. Due high-quality requirement of the fast scribing process, it is important to monitor the process in real time for detecting possible defects during the process. However, there is a lack of studies of laser scribing real time monitoring. Commonly used monitoring methods developed for other laser processes such a laser welding, are sufficient slow and existed applications cannot be implemented in fast laser scribing monitoring. The aim of this thesis is to find a method for laser scribing monitoring with a high-speed camera and evaluate reliability and performance of the developed monitoring system with experiments. The laser used in experiments is an IPG ytterbium pulsed fiber laser with 20 W maximum average power and Scan head optics used in the laser is Scanlab’s Hurryscan 14 II with an f100 tele-centric lens. The camera was connected to laser scanner using camera adapter to follow the laser process. A powerful fully programmable industrial computer was chosen for executing image processing and analysis. Algorithms for defect analysis, which are based on particle analysis, were developed using LabVIEW system design software. The performance of the algorithms was analyzed by analyzing a non-moving image from the scribing line with resolution 960x20 pixel. As a result, the maximum analysis speed was 560 frames per second. Reliability of the algorithm was evaluated by imaging scribing path with a variable number of defects 2000 mm/s when the laser was turned off and image analysis speed was 430 frames per second. The experiment was successful and as a result, the algorithms detected all defects from the scribing path. The final monitoring experiment was performed during a laser process. However, it was challenging to get active laser illumination work with the laser scanner due physical dimensions of the laser lens and the scanner. For reliable error detection, the illumination system is needed to be replaced.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The topic of this thesis is studying how lesions in retina caused by diabetic retinopathy can be detected from color fundus images by using machine vision methods. Methods for equalizing uneven illumination in fundus images, detecting regions of poor image quality due toinadequate illumination, and recognizing abnormal lesions were developed duringthe work. The developed methods exploit mainly the color information and simpleshape features to detect lesions. In addition, a graphical tool for collecting lesion data was developed. The tool was used by an ophthalmologist who marked lesions in the images to help method development and evaluation. The tool is a general purpose one, and thus it is possible to reuse the tool in similar projects.The developed methods were tested with a separate test set of 128 color fundus images. From test results it was calculated how accurately methods classify abnormal funduses as abnormal (sensitivity) and healthy funduses as normal (specificity). The sensitivity values were 92% for hemorrhages, 73% for red small dots (microaneurysms and small hemorrhages), and 77% for exudates (hard and soft exudates). The specificity values were 75% for hemorrhages, 70% for red small dots, and 50% for exudates. Thus, the developed methods detected hemorrhages accurately and microaneurysms and exudates moderately.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Perceiving the world visually is a basic act for humans, but for computers it is still an unsolved problem. The variability present innatural environments is an obstacle for effective computer vision. The goal of invariant object recognition is to recognise objects in a digital image despite variations in, for example, pose, lighting or occlusion. In this study, invariant object recognition is considered from the viewpoint of feature extraction. Thedifferences between local and global features are studied with emphasis on Hough transform and Gabor filtering based feature extraction. The methods are examined with respect to four capabilities: generality, invariance, stability, and efficiency. Invariant features are presented using both Hough transform and Gabor filtering. A modified Hough transform technique is also presented where the distortion tolerance is increased by incorporating local information. In addition, methods for decreasing the computational costs of the Hough transform employing parallel processing and local information are introduced.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This thesis deals with distance transforms which are a fundamental issue in image processing and computer vision. In this thesis, two new distance transforms for gray level images are presented. As a new application for distance transforms, they are applied to gray level image compression. The new distance transforms are both new extensions of the well known distance transform algorithm developed by Rosenfeld, Pfaltz and Lay. With some modification their algorithm which calculates a distance transform on binary images with a chosen kernel has been made to calculate a chessboard like distance transform with integer numbers (DTOCS) and a real value distance transform (EDTOCS) on gray level images. Both distance transforms, the DTOCS and EDTOCS, require only two passes over the graylevel image and are extremely simple to implement. Only two image buffers are needed: The original gray level image and the binary image which defines the region(s) of calculation. No other image buffers are needed even if more than one iteration round is performed. For large neighborhoods and complicated images the two pass distance algorithm has to be applied to the image more than once, typically 3 10 times. Different types of kernels can be adopted. It is important to notice that no other existing transform calculates the same kind of distance map as the DTOCS. All the other gray weighted distance function, GRAYMAT etc. algorithms find the minimum path joining two points by the smallest sum of gray levels or weighting the distance values directly by the gray levels in some manner. The DTOCS does not weight them that way. The DTOCS gives a weighted version of the chessboard distance map. The weights are not constant, but gray value differences of the original image. The difference between the DTOCS map and other distance transforms for gray level images is shown. The difference between the DTOCS and EDTOCS is that the EDTOCS calculates these gray level differences in a different way. It propagates local Euclidean distances inside a kernel. Analytical derivations of some results concerning the DTOCS and the EDTOCS are presented. Commonly distance transforms are used for feature extraction in pattern recognition and learning. Their use in image compression is very rare. This thesis introduces a new application area for distance transforms. Three new image compression algorithms based on the DTOCS and one based on the EDTOCS are presented. Control points, i.e. points that are considered fundamental for the reconstruction of the image, are selected from the gray level image using the DTOCS and the EDTOCS. The first group of methods select the maximas of the distance image to new control points and the second group of methods compare the DTOCS distance to binary image chessboard distance. The effect of applying threshold masks of different sizes along the threshold boundaries is studied. The time complexity of the compression algorithms is analyzed both analytically and experimentally. It is shown that the time complexity of the algorithms is independent of the number of control points, i.e. the compression ratio. Also a new morphological image decompression scheme is presented, the 8 kernels' method. Several decompressed images are presented. The best results are obtained using the Delaunay triangulation. The obtained image quality equals that of the DCT images with a 4 x 4

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Multispectral images are becoming more common in the field of remote sensing, computer vision, and industrial applications. Due to the high accuracy of the multispectral information, it can be used as an important quality factor in the inspection of industrial products. Recently, the development on multispectral imaging systems and the computational analysis on the multispectral images have been the focus of a growing interest. In this thesis, three areas of multispectral image analysis are considered. First, a method for analyzing multispectral textured images was developed. The method is based on a spectral cooccurrence matrix, which contains information of the joint distribution of spectral classes in a spectral domain. Next, a procedure for estimating the illumination spectrum of the color images was developed. Proposed method can be used, for example, in color constancy, color correction, and in the content based search from color image databases. Finally, color filters for the optical pattern recognition were designed, and a prototype of a spectral vision system was constructed. The spectral vision system can be used to acquire a low dimensional component image set for the two dimensional spectral image reconstruction. The data obtained by the spectral vision system is small and therefore convenient for storing and transmitting a spectral image.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The ability to recognize potential knowledge and convert it into business opportunities is one of the key factors of renewal in uncertain environments. This thesis examines absorptive capacity in the context of non-research and development innovation, with a primary focus on the social interaction that facilitates the absorption of knowledge. It proposes that everyone is and should be entitled to take part in the social interaction that shapes individual observations into innovations. Both innovation and absorptive capacity have been traditionally related to research and development departments and institutions. These innovations need to be adopted and adapted by others. This so-called waterfall model of innovations is only one aspect of new knowledge generation and innovation. In addition to this Science–Technology–Innovation perspective, more attention has been recently paid to the Doing–Using–Interacting mode of generating new knowledge and innovations. The amount of literature on absorptive capacity is vast, yet the concept is reified. The greater part of the literature links absorptive capacity to research and development departments. Some publications have focused on the nature of absorptive capacity in practice and the role of social interaction in enhancing it. Recent literature on absorptive capacity calls for studies that shed light on the relationship between individual absorptive capacity and organisational absorptive capacity. There has also been a call to examine absorptive capacity in non-research and development environments. Drawing on the literature on employee-driven innovation and social capital, this thesis looks at how individual observations and ideas are converted into something that an organisation can use. The critical phases of absorptive capacity, during which the ideas of individuals are incorporated into a group context, are assimilation and transformation. These two phases are seen as complementary: whereas assimilation is the application of easy-to-accept knowledge, transformation challenges the current way of thinking. The two require distinct kinds of social interaction and practices. The results of this study can been crystallised thus: “Enhancing absorptive capacity in practicebased non-research and development context is to organise the optimal circumstances for social interaction. Every individual is a potential source of signals leading to innovations. The individual, thus, recognises opportunities and acquires signals. Through the social interaction processes of assimilation and transformation, these signals are processed into the organisation’s reality and language. The conditions of creative social capital facilitate the interplay between assimilation and transformation. An organisation that strives for employee-driven innovation gains the benefits of a broader surface for opportunity recognition and faster absorption.” If organisations and managers become more aware of the benefits of enhancing absorptive capacity in practice, they have reason to assign resources to those practices that facilitate the creation of absorptive capacity. By recognising the underlying social mechanisms and structural features that lead either to assimilation or transformation, it is easier to balance between renewal and effective operations.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

During a possible loss of coolant accident in BWRs, a large amount of steam will be released from the reactor pressure vessel to the suppression pool. Steam will be condensed into the suppression pool causing dynamic and structural loads to the pool. The formation and break up of bubbles can be measured by visual observation using a suitable pattern recognition algorithm. The aim of this study was to improve the preliminary pattern recognition algorithm, developed by Vesa Tanskanen in his doctoral dissertation, by using MATLAB. Video material from the PPOOLEX test facility, recorded during thermal stratification and mixing experiments, was used as a reference in the development of the algorithm. The developed algorithm consists of two parts: the pattern recognition of the bubbles and the analysis of recognized bubble images. The bubble recognition works well, but some errors will appear due to the complex structure of the pool. The results of the image analysis were reasonable. The volume and the surface area of the bubbles were not evaluated. Chugging frequencies calculated by using FFT fitted well into the results of oscillation frequencies measured in the experiments. The pattern recognition algorithm works in the conditions it is designed for. If the measurement configuration will be changed, some modifications have to be done. Numerous improvements are proposed for the future 3D equipment.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this work, image based estimation methods, also known as direct methods, are studied which avoid feature extraction and matching completely. Cost functions use raw pixels as measurements and the goal is to produce precise 3D pose and structure estimates. The cost functions presented minimize the sensor error, because measurements are not transformed or modified. In photometric camera pose estimation, 3D rotation and translation parameters are estimated by minimizing a sequence of image based cost functions, which are non-linear due to perspective projection and lens distortion. In image based structure refinement, on the other hand, 3D structure is refined using a number of additional views and an image based cost metric. Image based estimation methods are particularly useful in conditions where the Lambertian assumption holds, and the 3D points have constant color despite viewing angle. The goal is to improve image based estimation methods, and to produce computationally efficient methods which can be accomodated into real-time applications. The developed image-based 3D pose and structure estimation methods are finally demonstrated in practise in indoor 3D reconstruction use, and in a live augmented reality application.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The recent emergence of low-cost RGB-D sensors has brought new opportunities for robotics by providing affordable devices that can provide synchronized images with both color and depth information. In this thesis, recent work on pose estimation utilizing RGBD sensors is reviewed. Also, a pose recognition system for rigid objects using RGB-D data is implemented. The implementation uses half-edge primitives extracted from the RGB-D images for pose estimation. The system is based on the probabilistic object representation framework by Detry et al., which utilizes Nonparametric Belief Propagation for pose inference. Experiments are performed on household objects to evaluate the performance and robustness of the system.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The future of privacy in the information age is a highly debated topic. In particular, new and emerging technologies such as ICTs and cognitive technologies are seen as threats to privacy. This thesis explores images of the future of privacy among non-experts within the time frame from the present until the year 2050. The aims of the study are to conceptualise privacy as a social and dynamic phenomenon, to understand how privacy is conceptualised among citizens and to analyse ideal-typical images of the future of privacy using the causal layered analysis method. The theoretical background of the thesis combines critical futures studies and critical realism, and the empirical material is drawn from three focus group sessions held in spring 2012 as part of the PRACTIS project. From a critical realist perspective, privacy is conceptualised as a social institution which creates and maintains boundaries between normative circles and preserves the social freedom of individuals. Privacy changes when actors with particular interests engage in technology-enabled practices which challenge current privacy norms. The thesis adopts a position of technological realism as opposed to determinism or neutralism. In the empirical part, the focus group participants are divided into four clusters based on differences in privacy conceptions and perceived threats and solutions. The clusters are fundamentalists, pragmatists, individualists and collectivists. Correspondingly, four ideal-typical images of the future are composed: ‘drift to low privacy’, ‘continuity and benign evolution’, ‘privatised privacy and an uncertain future’, and ‘responsible future or moral decline’. The images are analysed using the four layers of causal layered analysis: litany, system, worldview and myth. Each image has its strengths and weaknesses. The individualistic images tend to be fatalistic in character while the collectivistic images are somewhat utopian. In addition, the images have two common weaknesses: lack of recognition of ongoing developments and simplistic conceptions of privacy based on a dichotomy between the individual and society. The thesis argues for a dialectical understanding of futures as present images of the future and as outcomes of real processes and mechanisms. The first steps in promoting desirable futures are the awareness of privacy as a social institution, the awareness of current images of the future, including their assumptions and weaknesses, and an attitude of responsibility where futures are seen as the consequences of present choices.