46 results for Visual Tracking
Abstract:
The number of digital images has been increasing exponentially in the last few years. People have problems managing their image collections and finding a specific image. An automatic image categorization system could help them to manage images and find specific images. In this thesis, an unsupervised visual object categorization system was implemented to categorize a set of unknown images. Because the system is unsupervised, it does not need a training set of known images, which would have to be obtained manually. Therefore, the number of possible categories and images can be huge. The system implemented in the thesis extracts local features from the images. These local features are used to build a codebook. The local features and the codebook are then used to generate a feature vector for an image. Images are categorized based on the feature vectors. The system is able to categorize any given set of images based on the visual appearance of the images. Images that have similar image regions are grouped together in the same category. Thus, for example, images which contain cars are assigned to the same cluster. The unsupervised visual object categorization system can be used in many situations, e.g., in an Internet search engine. The system can categorize images for a user, and the user can then easily find a specific type of image.
Abstract:
This thesis presents two graphical user interfaces for the project DigiQ - Fusion of Digital and Visual Print Quality, a project for computationally modeling the subjective human experience of print quality by measuring the image with certain metrics. After the user interfaces are presented, methods for reducing the computation time of several of the metrics and of the image registration process required to compute the metrics are described, together with details of their performance. The weighted sample method for the image registration process significantly decreased the calculation times at the cost of some error. The random sampling method for the metrics greatly reduced calculation time while maintaining excellent accuracy, but worked with only two of the metrics.
Abstract:
The problem of understanding how humans perceive the quality of a reproduced image is of interest to researchers of many fields related to vision science and engineering: optics and material physics, image processing (compression and transfer), printing and media technology, and psychology. A measure for visual quality cannot be defined without ambiguity because it is ultimately the subjective opinion of an “end-user” observing the product. The purpose of this thesis is to devise computational methods to estimate the overall visual quality of prints, i.e. a numerical value that combines all the relevant attributes of the perceived image quality. The problem is limited to consider the perceived quality of printed photographs from the viewpoint of a consumer, and moreover, the study focuses only on digital printing methods, such as inkjet and electrophotography. The main contributions of this thesis are two novel methods to estimate the overall visual quality of prints. In the first method, the quality is computed as a visible difference between the reproduced image and the original digital (reference) image, which is assumed to have an ideal quality. The second method utilises instrumental print quality measures, such as colour densities, measured from printed technical test fields, and connects the instrumental measures to the overall quality via subjective attributes, i.e. attributes that directly contribute to the perceived quality, using a Bayesian network. Both approaches were evaluated and verified with real data, and shown to predict well the subjective evaluation results.
Abstract:
Local features are used in many computer vision tasks, including visual object categorization, content-based image retrieval and object recognition, to mention a few. Local features are points, blobs or regions in images that are extracted using a local feature detector. To make use of the extracted local features, the localized interest points are described using a local feature descriptor. A descriptor histogram vector is a compact representation of an image and can be used for searching and matching images in databases. In this thesis, the performance of local feature detectors and descriptors is evaluated for the object class detection task. Features are extracted from image samples belonging to several object classes. Matching features are then searched for using random image pairs of the same class. The goal of this thesis is to find the best detector and descriptor methods for this task in terms of detector repeatability and descriptor matching rate.
Abstract:
The present dissertation examined reading development during elementary school years by means of eye movement tracking. Three different but related issues in this field were assessed. First of all, the development of parafoveal processing skills in reading was investigated. Second, it was assessed whether and to what extent sublexical units such as syllables and morphemes are used in processing Finnish words and whether the use of these sublexical units changes as a function of reading proficiency. Finally, the developmental trend in the speed of visual information extraction during reading was examined. With regard to parafoveal processing skills, it was shown that 2nd graders extract letter identity information approx. 5 characters to the right of fixation, 4th graders approx. 7 characters to the right of fixation, and 6th graders and adults approx. 9 characters to the right of fixation. Furthermore, it was shown that all age groups extract more parafoveal information within compound words than across adjective-noun pairs of similar length. In compounds, parafoveal word information can be extracted in parallel with foveal word information, if the compound in question is of high frequency. With regard to the use of sublexical units in Finnish word processing, it was shown that less proficient 2nd graders use both syllables and morphemes in the course of lexical access. More proficient 2nd graders as well as older readers seem to process words more holistically. Finally, it was shown that 60 ms is enough for 4th graders and adults to extract visual information from both 4-letter and 8-letter words, whereas 2nd graders clearly needed more than 60 ms to extract all information from 8-letter words for processing to proceed smoothly. The present dissertation demonstrates that Finnish 2nd graders develop their reading skills rapidly and are already at an adult level in some aspects of reading. 
This is not to say that there are no differences between less proficient (e.g., 2nd graders) and more proficient readers (e.g., adults) but in some respects it seems that the visual system used in extracting information from the text is matured by the 2nd grade. Furthermore, the present dissertation demonstrates that the allocation of attention in reading depends much on textual properties such as word frequency and whether words are spatially unified (as in compounds) or not. This flexibility of the attentional system naturally needs to be captured in word processing models. Finally, individual differences within age groups are quite substantial but it seems that by the end of the 2nd grade practically all Finnish children have reached a reasonable level of reading proficiency.
Abstract:
Since the introduction of automatic orbital welding in pipeline applications in 1961, orbital pipe welding systems have improved significantly. The demand for more productive welding systems for pipeline applications forces manufacturers to develop new advanced systems and welding processes for the orbital welding method. Various methods have been used to make the welding process adaptive, such as visual sensing, passive visual sensing, real-time intelligent control, the scan welding technique, multi laser vision sensors, thermal scanning, adaptive image processing, neural network models, machine vision, and optical sensing. Numerous studies are reviewed and discussed in this Master’s thesis, and based on a wide range of experiments already carried out by different researchers, vision sensors are reported to be the best choice for adaptive orbital pipe welding systems. The study also explains most of the welding processes used in orbital welding systems and most of the pipe variations welded with them, mainly for oil and gas pipeline applications. The welding results show that Gas Metal Arc Welding (GMAW) and its variants, such as Surface Tension Transfer (STT) and modified short circuit, are the most preferred processes for welding the root pass and can replace Gas Tungsten Arc Welding (GTAW) in many applications. Furthermore, the dual-tandem gas metal arc welding technique is currently considered the most efficient method for welding the fill pass. The orbital GTAW process is mostly applied in applications ranging from single-run welding of thin-walled stainless tubes to multi-run welding of thick-walled pipes. Flux cored arc welding is a faster process with a higher deposition rate, and it has recently become more popular in pipe welding applications. Also, the combination of gas metal arc welding and Nd:YAG laser has shown acceptable results in girth welding of land pipelines for the oil and gas industry. 
This Master’s thesis can serve as a guideline for welding pipes and tubes to achieve higher quality and efficiency. The research can also be used as a basis for future investigations that supplement the present findings.
Abstract:
The large and growing number of digital images is making manual image search laborious. Only a fraction of the images contain metadata that can be used to search for a particular type of image. Thus, the main research question of this thesis is whether it is possible to learn visual object categories directly from images. Computers process images as long lists of pixels that do not have a clear connection to high-level semantics which could be used in the image search. There are various methods introduced in the literature to extract low-level image features and also approaches to connect these low-level features with high-level semantics. One of these approaches is called Bag-of-Features which is studied in the thesis. In the Bag-of-Features approach, the images are described using a visual codebook. The codebook is built from the descriptions of the image patches using clustering. The images are described by matching descriptions of image patches with the visual codebook and computing the number of matches for each code. In this thesis, unsupervised visual object categorisation using the Bag-of-Features approach is studied. The goal is to find groups of similar images, e.g., images that contain an object from the same category. The standard Bag-of-Features approach is improved by using spatial information and visual saliency. It was found that the performance of the visual object categorisation can be improved by using spatial information of local features to verify the matches. However, this process is computationally heavy, and thus, the number of images must be limited in the spatial matching, for example, by using the Bag-of-Features method as in this study. Different approaches for saliency detection are studied and a new method based on the Hessian-Affine local feature detector is proposed. The new method achieves comparable results with current state-of-the-art. 
The visual object categorisation performance was improved by using foreground segmentation based on saliency information, especially when the background could be considered as clutter.
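The Bag-of-Features description above can be sketched in a few lines. This is a minimal illustration, not the thesis implementation: it assumes that local descriptors have already been extracted as numeric tuples, uses a plain k-means loop for the clustering step, and the names `build_codebook` and `bof_histogram` as well as the toy 2-D descriptors are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_code(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))


def build_codebook(descriptors, num_codes, iterations=20, seed=0):
    """Cluster descriptors with plain k-means; the centres form the codebook."""
    rng = random.Random(seed)
    codebook = [list(d) for d in rng.sample(descriptors, num_codes)]
    for _ in range(iterations):
        groups = [[] for _ in range(num_codes)]
        for d in descriptors:
            groups[nearest_code(d, codebook)].append(d)
        for k, group in enumerate(groups):
            if group:  # keep the old centre if no descriptor was assigned
                codebook[k] = [sum(col) / len(group) for col in zip(*group)]
    return codebook


def bof_histogram(descriptors, codebook):
    """Count the matches per code and normalise to a feature vector."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_code(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Toy 2-D "descriptors" (assumed data): two well-separated patch types.
training = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
codebook = build_codebook(training, num_codes=2)

image_a = bof_histogram([(0, 0), (1, 0)], codebook)    # one patch type only
image_b = bof_histogram([(0, 0), (10, 10)], codebook)  # one patch of each type
```

Images whose histograms are close to each other can then be grouped by any clustering method; the spatial verification and saliency steps mentioned in the abstract are not covered by this sketch.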
Abstract:
This dissertation examined skill development in music reading by focusing on the visual processing of music notation in different music-reading tasks. Each of the three experiments of this dissertation addressed one of the three types of music reading: (i) sight-reading, i.e. reading and performing completely unknown music, (ii) rehearsed reading, during which the performer is already familiar with the music being played, and (iii) silent reading with no performance requirements. The use of the eye-tracking methodology allowed the readers’ eye movements during music reading to be recorded with extreme precision. Due to the lack of coherence in the small number of prior studies on eye movements in music reading, the dissertation also had a heavy methodological emphasis. The present dissertation thus had two major aims: (1) it investigated the eye-movement indicators of skill and skill development in sight-reading, rehearsed reading and silent reading, and (2) it developed and tested suitable methods that can be used by future studies on the topic. Experiment I focused on the eye-movement behaviour of adults during their first steps of learning to read music notation. The longitudinal experiment spanned a nine-month long music-training period, during which 49 participants (university students taking part in a compulsory music course) sight-read and performed a series of simple melodies in three measurement sessions. Participants with no musical background were labelled “novices”, whereas “amateurs” had had musical training prior to the experiment. The main issue of interest was the changes in the novices’ eye movements and performances across the measurements, while the amateurs offered a point of reference for the assessment of the novices’ development. The experiment showed that the novices tended to sight-read in a more stepwise fashion than the amateurs, the latter group manifesting more back-and-forth eye movements. 
The novices’ skill development was reflected by the faster identification of note symbols involved in larger melodic intervals. Across the measurements, the novices also began to show sensitivity to the melodies’ metrical structure, which the amateurs demonstrated from the very beginning. The stimulus melodies consisted of quarter notes, making the effects of meter and larger melodic intervals distinguishable from effects caused by, say, different rhythmic patterns. Experiment II explored the eye movements of 40 experienced musicians (music education students and music performance students) during temporally controlled rehearsed reading. This cross-sectional experiment focused on the eye-movement effects of one-bar-long melodic alterations placed within a familiar melody. The synchronizing of the performance and eye-movement recordings enabled the investigation of the eye-hand span, i.e., the temporal gap between a performed note and the point of gaze. The eye-hand span was typically found to remain around one second. Music performance students demonstrated increased processing efficiency by their shorter average fixation durations as well as in the two examined eye-hand span measures: these participants used larger eye-hand spans more frequently and inspected more of the musical score during the performance of one metrical beat than students of music education. Although all participants produced performances almost indistinguishable in terms of their auditory characteristics, the altered bars indeed affected the reading of the score: the general effects of expertise in terms of the two eye-hand span measures, demonstrated by the music performance students, disappeared in the face of the melodic alterations. Experiment III was a longitudinal experiment designed to examine the differences between adult novice and amateur musicians’ silent reading of music notation, as well as the changes the 49 participants manifested during a nine-month long music course. 
From a methodological perspective, an opening to research on eye movements in music reading was the inclusion of a verbal protocol in the research design: after viewing the musical image, the readers were asked to describe what they had seen. A two-way categorization for verbal descriptions was developed in order to assess the quality of extracted musical information. More extensive musical background was related to shorter average fixation duration, more linear scanning of the musical image, and more sophisticated verbal descriptions of the music in question. No apparent effects of skill development were observed for the novice music readers alone, but all participants improved their verbal descriptions towards the last measurement. Apart from the background-related differences between groups of participants, combining verbal and eye-movement data in a cluster analysis identified three styles of silent reading. The finding demonstrated individual differences in how the freely defined silent-reading task was approached. This dissertation is among the first presentations of a series of experiments systematically addressing the visual processing of music notation in various types of music-reading tasks and focusing especially on the eye-movement indicators of developing music-reading skill. Overall, the experiments demonstrate that the music-reading processes are affected not only by “top-down” factors, such as musical background, but also by the “bottom-up” effects of specific features of music notation, such as pitch heights, metrical division, rhythmic patterns and unexpected melodic events. From a methodological perspective, the experiments emphasize the importance of systematic stimulus design, temporal control during performance tasks, and the development of complementary methods, for easing the interpretation of the eye-movement data. 
To conclude, this dissertation suggests that advances in comprehending the cognitive aspects of music reading, the nature of expertise in this musical task, and the development of educational tools can be attained through the systematic application of the eye-tracking methodology also in this specific domain.
Abstract:
This thesis was part of a lean adaptation project started at the Outotec Lappeenranta factory in early 2013. The purpose of this thesis was to develop and propose lean tools that could be used in daily management, visual management and continuous improvement. This thesis took an “outsider’s” view and, as such, did not study the current processes deeply. As a result of this thesis, two different Daily Management boards were designed, one for parallel processes and one for sequential processes. In addition, methods for continuous improvement and daily task accountability were framed and standard work for the leaders was outlined. The tools presented in this thesis are general tools that support work in a lean environment. They are visual and, if used correctly, they provide a basis for continuous improvement. Lean philosophy emphasizes a deep understanding of the current situation, and it would be against the lean principles to blindly implement anything developed “on the outside”. The tools presented should be reviewed and modified further by the people working on the factory floor.
Abstract:
A topic that has attracted interest in both industry and research is customer relationship management (CRM), i.e., a customer-oriented business strategy in which companies move from being product-oriented to being more customer-centric. Nowadays, customers' behaviour and activities can easily be recorded and stored with the help of enterprise resource planning (ERP) systems and data warehousing (DW). Customers with different preferences and purchasing behaviour create their own "signature", in particular through the use of loyalty cards, which enables versatile modelling of customers' purchasing behaviour. To gain an overview of customers' purchasing behaviour and their profitability, customer segmentation is often used as a method for dividing customers into groups based on their similarities. The most commonly used methods for customer segmentation are analytical models constructed for a specific time period. These models do not take into account that customers' behaviour may change over time. This thesis creates a holistic overview of customers' characteristics and purchasing behaviour that, in addition to the conventional segmentation models, also takes the dynamics of purchasing behaviour into account. The dynamics in a customer segmentation model comprise changes in the structure and content of the segments, as well as changes in individual customers' segment membership (so-called migration analyses). Each of these changes is modelled, analysed and exemplified with visual data mining techniques, primarily self-organizing maps (SOM) and self-organizing time maps (SOTM), a further development of the SOM. The visualization is expected to facilitate the interpretation of identified patterns and to smooth the transfer of knowledge between the analyst and the decision maker.
Customer relationship management (CRM), i.e., changing an organization from product-focused to customer-centric, has attracted interest in both academia and business. Customers' behaviour and activities can nowadays easily be recorded and stored using enterprise resource planning systems and data warehouses; customers continuously leave data traces about their characteristics and purchasing behaviour, which can be analysed. It is common for customers to differ from one another in various ways, and their preferences as well as their purchasing behaviour may be very different. Customer segmentation, i.e., dividing customers into groups based on similarity, is therefore widely used when examining the diversity and profitability of customer behaviour. Traditional customer segmentation solutions are often single analytical models built on a specific time period. As a result, they often ignore the fact that customer behaviour may change over time. This dissertation therefore aims to provide a holistic view of customers' characteristics and purchasing behaviour by examining two forces of change in addition to traditional segmentation models based on a specific time frame. These two dynamics of a customer segmentation model are changes in the structure of the segments and changes in individual customers' segment membership. The first dynamic is approached through temporal customer segmentation, which visualizes changes over time in the structures and profiles of the segments. The second dynamic is approached using so-called segment migration analysis, in which customers who switch from one segment to another in a similar manner are identified by visual means. The purpose of the visualization is to support the interpretation of the patterns found and to facilitate the transfer of information between the analyst and decision makers. Visual data mining methods, such as self-organizing maps and their extensions, are used to demonstrate the usefulness of these methods both in customer relationship management in general and in customer segmentation in particular.
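The idea of segment migration analysis, i.e. tracking how individual customers' segment membership changes between periods, can be illustrated with a simple tabulation. This is only a sketch under stated assumptions: the thesis works with self-organizing maps and visual methods, whereas the code below merely counts transitions between two given segment assignments, and the function name `migration_counts`, the customer ids and the segment labels are all hypothetical.

```python
from collections import Counter


def migration_counts(before, after):
    """Count customers per (old segment, new segment) pair.

    before, after: dicts mapping customer id -> segment label for two
    consecutive periods. Customers missing from `after` are skipped.
    """
    moves = Counter()
    for customer, old_segment in before.items():
        new_segment = after.get(customer)
        if new_segment is not None:
            moves[(old_segment, new_segment)] += 1
    return moves


# Assumed toy data: segment membership of four customers in two periods.
period_1 = {"c1": "A", "c2": "A", "c3": "B", "c4": "C"}
period_2 = {"c1": "A", "c2": "B", "c3": "B", "c4": "B"}

moves = migration_counts(period_1, period_2)

# Keep only real migrations, i.e. pairs where the segment changed.
migrants = {pair: n for pair, n in moves.items() if pair[0] != pair[1]}
```

In this toy example two customers migrate into segment B, one from A and one from C, while the other two stay where they were.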
Abstract:
Workshop at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
This thesis studies automatic traffic sign inventory and condition analysis using machine vision and pattern recognition methods. Automatic traffic sign inventory and condition analysis can be used to make road maintenance more efficient, to improve maintenance processes, and to enable intelligent driving systems. Automatic traffic sign detection and classification has been researched before from the viewpoint of self-driving vehicles, driver assistance systems, and the use of signs in mapping services. Machine vision based inventory of traffic signs consists of detection, classification, localization, and condition analysis of traffic signs. The performance of the produced machine vision system is estimated with three datasets, two of which have been collected for this thesis. Based on the experiments, almost all traffic signs can be detected, classified, and located, and their condition analysed. In the future, the inventory system performance has to be verified in challenging conditions and the system has to be pilot tested.