19 resultados para Object manipulation
em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Resumo:
Perceiving the world visually is a basic act for humans, but for computers it is still an unsolved problem. The variability present innatural environments is an obstacle for effective computer vision. The goal of invariant object recognition is to recognise objects in a digital image despite variations in, for example, pose, lighting or occlusion. In this study, invariant object recognition is considered from the viewpoint of feature extraction. Thedifferences between local and global features are studied with emphasis on Hough transform and Gabor filtering based feature extraction. The methods are examined with respect to four capabilities: generality, invariance, stability, and efficiency. Invariant features are presented using both Hough transform and Gabor filtering. A modified Hough transform technique is also presented where the distortion tolerance is increased by incorporating local information. In addition, methods for decreasing the computational costs of the Hough transform employing parallel processing and local information are introduced.
Resumo:
Tässä insinöörityössä esitellään Stadian verkkoviestinnän VIDEOS-hankkeeseen liittyvän web-pohjaisen videoeditorin kehitys ja käytetyt teknologiat. Fooga-nimiseksi nimetty videoeditorin käyttämät tekniikat ovat Ruby, Ruby on Rails, FFmpeg, Mencoder, ImageMagick ja FLVTool2. Ruby on olio-pohjainen skriptikieli, Ruby on Rails on websovelluskehys ja muut tekniikat ovat komentorivipohjaisia työkaluja, jotka tarjoavat tärkeimmät toiminnallisuudet Foogalle. Tavoitteina oli tämän työn yhteydessä ohjelmoida Foogaan perustoiminnallisuudet, jotka mahdollistavat minimaaliset käyttömahdollisuudet kevääseen 2007 mennessä. Kehitystyö jatkuu vuoteen 2009 asti tarjoamalla samalla mahdollisuuden usealle insinöörityölle tekniikan ja liikenteen koulutusohjelmasta. Tämän lisäksi tässä insinöörityössä perehdytään Object-Relational Mapping-tekniikan perusteisiiin ja verrataan Ruby on Railsin ja Javan ORM-ominaisuuksia. Ruby on Railsin osalta esitellään ActiveRecord-luokka ja Javan osalta Hibernate, jonka johdantona on DAO/DTO-sunnittelumalli.
Resumo:
Tässä työssä on esitetty sen ohjelmiston kehittämisen prosessi, joka on tarkoitettu annettavien palveluiden valvottavaksi käyttäen prototyyppimallia. Raportti sisältää vaatimusten, kohteisiin suunnatun analyysin ja suunnittelun, realisointiprosessien kuvauksen ja prototyypin testauksen. Ohjelmiston käyttöala – antavien palveluiden valvonta. Vaatimukset sovellukselle analysoitiin ohjelmistomarkkinoiden perusteella sekä ohjelmiston engineeringin periaatteiden mukaisesti. Ohjelmiston prototyyppi on realisoitu käyttäen asiakas-/palvelinhybridimallia sekä ralaatiokantaa. Kehitetty ohjelmisto on tarkoitettu venäläisille tietokonekerhoille, jotka erikoistuvat pelipalvelinten antamiseen.
Resumo:
Tämä diplomityökuuluu tietoliikenneverkkojen suunnittelun tutkimukseen ja pohjimmiltaan kohdistuu verkon mallintamiseen. Tietoliikenneverkkojen suunnittelu on monimutkainen ja vaativa ongelma, joka sisältää mutkikkaita ja aikaa vieviä tehtäviä. Tämä diplomityö esittelee ”monikerroksisen verkkomallin”, jonka tarkoitus on auttaa verkon suunnittelijoita selviytymään ongelmien monimutkaisuudesta ja vähentää verkkojen suunnitteluun kuluvaa aikaa. Monikerroksinen verkkomalli perustuu yleisille objekteille, jotka ovat yhteisiä kaikille tietoliikenneverkoille. Tämä tekee mallista soveltuvan mielivaltaisille verkoille, välittämättä verkkokohtaisista ominaisuuksista tai verkon toteutuksessa käytetyistä teknologioista. Malli määrittelee tarkan terminologian ja käyttää kolmea käsitettä: verkon jakaminen tasoihin (plane separation), kerrosten muodostaminen (layering) ja osittaminen (partitioning). Nämä käsitteet kuvataan yksityiskohtaisesti tässä työssä. Monikerroksisen verkkomallin sisäinen rakenne ja toiminnallisuus ovat määritelty käyttäen Unified Modelling Language (UML) -notaatiota. Tämä työ esittelee mallin use case- , paketti- ja luokkakaaviot. Diplomityö esittelee myös tulokset, jotka on saatu vertailemalla monikerroksista verkkomallia muihin verkkomalleihin. Tulokset osoittavat, että monikerroksisella verkkomallilla on etuja muihin malleihin verrattuna.
Resumo:
The number of digital images has been increasing exponentially in the last few years. People have problems managing their image collections and finding a specific image. An automatic image categorization system could help them to manage images and find specific images. In this thesis, an unsupervised visual object categorization system was implemented to categorize a set of unknown images. The system is unsupervised, and hence, it does not need known images to train the system which needs to be manually obtained. Therefore, the number of possible categories and images can be huge. The system implemented in the thesis extracts local features from the images. These local features are used to build a codebook. The local features and the codebook are then used to generate a feature vector for an image. Images are categorized based on the feature vectors. The system is able to categorize any given set of images based on the visual appearance of the images. Images that have similar image regions are grouped together in the same category. Thus, for example, images which contain cars are assigned to the same cluster. The unsupervised visual object categorization system can be used in many situations, e.g., in an Internet search engine. The system can categorize images for a user, and the user can then easily find a specific type of image.
Resumo:
Sensor-based robot control allows manipulation in dynamic environments with uncertainties. Vision is a versatile low-cost sensory modality, but low sample rate, high sensor delay and uncertain measurements limit its usability, especially in strongly dynamic environments. Force is a complementary sensory modality allowing accurate measurements of local object shape when a tooltip is in contact with the object. In multimodal sensor fusion, several sensors measuring different modalities are combined to give a more accurate estimate of the environment. As force and vision are fundamentally different sensory modalities not sharing a common representation, combining the information from these sensors is not straightforward. In this thesis, methods for fusing proprioception, force and vision together are proposed. Making assumptions of object shape and modeling the uncertainties of the sensors, the measurements can be fused together in an extended Kalman filter. The fusion of force and visual measurements makes it possible to estimate the pose of a moving target with an end-effector mounted moving camera at high rate and accuracy. The proposed approach takes the latency of the vision system into account explicitly, to provide high sample rate estimates. The estimates also allow a smooth transition from vision-based motion control to force control. The velocity of the end-effector can be controlled by estimating the distance to the target by vision and determining the velocity profile giving rapid approach and minimal force overshoot. Experiments with a 5-degree-of-freedom parallel hydraulic manipulator and a 6-degree-of-freedom serial manipulator show that integration of several sensor modalities can increase the accuracy of the measurements significantly.
Resumo:
This study presents the information required to describe the machine and device resources in the turret punch press environment which are needed for the development of the analysing method for automated production. The description of product and device resources and their interconnectedness is the starting point for method comparison the development of expenses, production planning and the performance of optimisation. The manufacturing method cannot be optimized unless the variables and their interdependence are known. Sheet metal parts in particular may then become remarkably complex, and their automatic manufacture may be difficult or, with some automatic equipment, even impossible if not know manufacturing properties. This thesis consists of three main elements, which constitute the triangulation. In the first phase of triangulation, the manufacture occuring on a turret punch press is examined in order to find the factors that affect the efficiency of production. In the second phase of triangulation, the manufacturability of products on turret punch presses is examined through a set of laboratory tests. The third phase oftriangulation involves an examination of five industry parts. The main key findings of this study are: all possible efficiency in high automation level machining cannot be achieved unless the raw materials used in production and the dependencies of the machine and tools are well known. Machine-specific manufacturability factors for turret punch presses were not taken into account in the industrial case samples. On the grounds of the performed tests and industrial case samples, the designer of a sheet metal product can directly influence the machining time, material loss, energy consumption and the number of tools required on a turret punch press by making decisions in the way presented in the hypothesis of thisstudy. The sheet metal parts to be produced can be optimised to bemanufactured on a turret punch press when the material to be used and the kinds of machine and tool options available are known. This provides in-depth knowledge of the machine and tool properties machine and tool-specifically. None of the optimisation starting points described here is a separate entity; instead, they are all connected to each other.
Resumo:
Local features are used in many computer vision tasks including visual object categorization, content-based image retrieval and object recognition to mention a few. Local features are points, blobs or regions in images that are extracted using a local feature detector. To make use of extracted local features the localized interest points are described using a local feature descriptor. A descriptor histogram vector is a compact representation of an image and can be used for searching and matching images in databases. In this thesis the performance of local feature detectors and descriptors is evaluated for object class detection task. Features are extracted from image samples belonging to several object classes. Matching features are then searched using random image pairs of a same class. The goal of this thesis is to find out what are the best detector and descriptor methods for such task in terms of detector repeatability and descriptor matching rate.
Resumo:
The main focus of the present thesis was at verbal episodic memory processes that are particularly vulnerable to preclinical and clinical Alzheimer’s disease (AD). Here these processes were studied by a word learning paradigm, cutting across the domains of memory and language learning studies. Moreover, the differentiation between normal aging, mild cognitive impairment (MCI) and AD was studied by the cognitive screening test CERAD. In study I, the aim was to examine how patients with amnestic MCI differ from healthy controls in the different CERAD subtests. Also, the sensitivity and specificity of the CERAD screening test to MCI and AD was examined, as previous studies on the sensitivity and specificity of the CERAD have not included MCI patients. The results indicated that MCI is characterized by an encoding deficit, as shown by the overall worse performance on the CERAD Wordlist learning test compared with controls. As a screening test, CERAD was not very sensitive to MCI. In study II, verbal learning and forgetting in amnestic MCI, AD and healthy elderly controls was investigated with an experimental word learning paradigm, where names of 40 unfamiliar objects (mainly archaic tools) were trained with or without semantic support. The object names were trained during a 4-day long period and a follow-up was conducted one week, 4 weeks and 8 weeks after the training period. Manipulation of semantic support was included in the paradigm because it was hypothesized that semantic support might have some beneficial effects in the present learning task especially for the MCI group, as semantic memory is quite well preserved in MCI in contrast to episodic memory. We found that word learning was significantly impaired in MCI and AD patients, whereas forgetting patterns were similar across groups. Semantic support showed a beneficial effect on object name retrieval in the MCI group 8 weeks after training, indicating that the MCI patients’ preserved semantic memory abilities compensated for their impaired episodic memory. The MCI group performed equally well as the controls in the tasks tapping incidental learning and recognition memory, whereas the AD group showed impairment. Both the MCI and the AD group benefited less from phonological cueing than the controls. Our findings indicate that acquisition is compromised in both MCI and AD, whereas long13 term retention is not affected to the same extent. Incidental learning and recognition memory seem to be well preserved in MCI. In studies III and IV, the neural correlates of naming newly learned objects were examined in healthy elderly subjects and in amnestic MCI patients by means of positron emission tomography (PET) right after the training period. The naming of newly learned objects by healthy elderly subjects recruited a left-lateralized network, including frontotemporal regions and the cerebellum, which was more extensive than the one related to the naming of familiar objects (study III). Semantic support showed no effects on the PET results for the healthy subjects. The observed activation increases may reflect lexicalsemantic and lexical-phonological retrieval, as well as more general associative memory mechanisms. In study IV, compared to the controls, the MCI patients showed increased anterior cingulate activation when naming newly learned objects that had been learned without semantic support. This suggests a recruitment of additional executive and attentional resources in the MCI group.
Resumo:
The large and growing number of digital images is making manual image search laborious. Only a fraction of the images contain metadata that can be used to search for a particular type of image. Thus, the main research question of this thesis is whether it is possible to learn visual object categories directly from images. Computers process images as long lists of pixels that do not have a clear connection to high-level semantics which could be used in the image search. There are various methods introduced in the literature to extract low-level image features and also approaches to connect these low-level features with high-level semantics. One of these approaches is called Bag-of-Features which is studied in the thesis. In the Bag-of-Features approach, the images are described using a visual codebook. The codebook is built from the descriptions of the image patches using clustering. The images are described by matching descriptions of image patches with the visual codebook and computing the number of matches for each code. In this thesis, unsupervised visual object categorisation using the Bag-of-Features approach is studied. The goal is to find groups of similar images, e.g., images that contain an object from the same category. The standard Bag-of-Features approach is improved by using spatial information and visual saliency. It was found that the performance of the visual object categorisation can be improved by using spatial information of local features to verify the matches. However, this process is computationally heavy, and thus, the number of images must be limited in the spatial matching, for example, by using the Bag-of-Features method as in this study. Different approaches for saliency detection are studied and a new method based on the Hessian-Affine local feature detector is proposed. The new method achieves comparable results with current state-of-the-art. The visual object categorisation performance was improved by using foreground segmentation based on saliency information, especially when the background could be considered as clutter.
Resumo:
Robotic grasping has been studied increasingly for a few decades. While progress has been made in this field, robotic hands are still nowhere near the capability of human hands. However, in the past few years, the increase in computational power and the availability of commercial tactile sensors have made it easier to develop techniques that exploit the feedback from the hand itself, the sense of touch. The focus of this thesis lies in the use of this sense. The work described in this thesis focuses on robotic grasping from two different viewpoints: robotic systems and data-driven grasping. The robotic systems viewpoint describes a complete architecture for the act of grasping and, to a lesser extent, more general manipulation. Two central claims that the architecture was designed for are hardware independence and the use of sensors during grasping. These properties enables the use of multiple different robotic platforms within the architecture. Secondly, new data-driven methods are proposed that can be incorporated into the grasping process. The first of these methods is a novel way of learning grasp stability from the tactile and haptic feedback of the hand instead of analytically solving the stability from a set of known contacts between the hand and the object. By learning from the data directly, there is no need to know the properties of the hand, such as kinematics, enabling the method to be utilized with complex hands. The second novel method, probabilistic grasping, combines the fields of tactile exploration and grasp planning. By employing well-known statistical methods and pre-existing knowledge of an object, object properties, such as pose, can be inferred with related uncertainty. This uncertainty is utilized by a grasp planning process which plans for stable grasps under the inferred uncertainty.
Resumo:
Object-oriented programming is a widely adopted paradigm for desktop software development. This paradigm partitions software into separate entities, objects, which consist of data and related procedures used to modify and inspect it. The paradigm has evolved during the last few decades to emphasize decoupling between object implementations, via means such as explicit interface inheritance and event-based implicit invocation. Inter-process communication (IPC) technologies allow applications to interact with each other. This enables making software distributed across multiple processes, resulting in a modular architecture with benefits in resource sharing, robustness, code reuse and security. The support for object-oriented programming concepts varies between IPC systems. This thesis is focused on the D-Bus system, which has recently gained a lot of users, but is still scantily researched. D-Bus has support for asynchronous remote procedure calls with return values and a content-based publish/subscribe event delivery mechanism. In this thesis, several patterns for method invocation in D-Bus and similar systems are compared. The patterns that simulate synchronous local calls are shown to be dangerous. Later, we present a state-caching proxy construct, which avoids the complexity of properly asynchronous calls for object inspection. The proxy and certain supplementary constructs are presented conceptually as generic object-oriented design patterns. The e ect of these patterns on non-functional qualities of software, such as complexity, performance and power consumption, is reasoned about based on the properties of the D-Bus system. The use of the patterns reduces complexity, but maintains the other qualities at a good level. Finally, we present currently existing means of specifying D-Bus object interfaces for the purposes of code and documentation generation. The interface description language used by the Telepathy modular IM/VoIP framework is found to be an useful extension of the basic D-Bus introspection format.
Resumo:
The usage of digital content, such as video clips and images, has increased dramatically during the last decade. Local image features have been applied increasingly in various image and video retrieval applications. This thesis evaluates local features and applies them to image and video processing tasks. The results of the study show that 1) the performance of different local feature detector and descriptor methods vary significantly in object class matching, 2) local features can be applied in image alignment with superior results against the state-of-the-art, 3) the local feature based shot boundary detection method produces promising results, and 4) the local feature based hierarchical video summarization method shows promising new new research direction. In conclusion, this thesis presents the local features as a powerful tool in many applications and the imminent future work should concentrate on improving the quality of the local features.