59 results for video object segmentation

in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland


Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this thesis, we propose to infer pixel-level labels in video using only object category information, exploiting the intrinsic structure of video data. Our motivation is the observation that image-level labels are much easier to acquire than pixel-level labels, and that it is natural to link image-level recognition with pixel-level classification in video data, so that learned recognition models can be transferred from one domain to the other. To this end, this thesis proposes two domain adaptation approaches that adapt a deep convolutional neural network (CNN) image recognition model, trained on labelled image data, to the target domain by exploiting both the semantic evidence learned by the CNN and the intrinsic structure of unlabelled video data. The proposed approaches explicitly model and compensate for the adaptation from the source domain to the target domain, which in turn underpins a robust semantic object segmentation method for natural videos. We demonstrate the superior performance of our methods through extensive evaluations on challenging datasets, comparing against state-of-the-art methods.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The use of digital content, such as video clips and images, has increased dramatically during the last decade. Local image features have been applied increasingly in various image and video retrieval applications. This thesis evaluates local features and applies them to image and video processing tasks. The results of the study show that 1) the performance of different local feature detector and descriptor methods varies significantly in object class matching, 2) local features can be applied in image alignment with results superior to the state of the art, 3) the local feature based shot boundary detection method produces promising results, and 4) the local feature based hierarchical video summarization method points to a promising new research direction. In conclusion, this thesis presents local features as a powerful tool in many applications, and immediate future work should concentrate on improving the quality of the local features.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The large and growing number of digital images is making manual image search laborious. Only a fraction of the images contain metadata that can be used to search for a particular type of image. Thus, the main research question of this thesis is whether it is possible to learn visual object categories directly from images. Computers process images as long lists of pixels that have no clear connection to the high-level semantics that could be used in image search. The literature introduces various methods for extracting low-level image features, as well as approaches for connecting these low-level features with high-level semantics. One of these approaches, Bag-of-Features, is studied in this thesis. In the Bag-of-Features approach, images are described using a visual codebook. The codebook is built by clustering the descriptions of image patches. An image is then described by matching the descriptions of its patches against the visual codebook and counting the matches for each code. In this thesis, unsupervised visual object categorisation using the Bag-of-Features approach is studied. The goal is to find groups of similar images, e.g., images that contain an object from the same category. The standard Bag-of-Features approach is improved by using spatial information and visual saliency. It was found that the performance of visual object categorisation can be improved by using the spatial information of local features to verify the matches. However, this process is computationally heavy, and thus the number of images must be limited in the spatial matching, for example by using the Bag-of-Features method as in this study. Different approaches for saliency detection are studied, and a new method based on the Hessian-Affine local feature detector is proposed. The new method achieves results comparable with the current state of the art.
The visual object categorisation performance was improved by using foreground segmentation based on saliency information, especially when the background could be considered as clutter.
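The codebook construction and matching steps described above can be illustrated with a minimal sketch. It assumes that patch descriptors are plain numeric vectors and that clustering is done with ordinary k-means; the abstract does not say which clustering method or descriptor the thesis uses, so the function names, the initialisation and the toy data are all hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_code(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))


def build_codebook(descriptors, k, iterations=10):
    """Cluster patch descriptors with plain k-means (Lloyd's algorithm).

    Assumption: the first k descriptors serve as initial centres, which
    keeps this sketch deterministic but is not a recommended initialisation.
    """
    codebook = [list(d) for d in descriptors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_code(d, codebook)].append(d)
        for i, members in enumerate(clusters):
            if members:  # keep the old centre if a cluster is empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def bof_histogram(image_descriptors, codebook):
    """Describe one image by counting its patch matches for each code."""
    hist = [0] * len(codebook)
    for d in image_descriptors:
        hist[nearest_code(d, codebook)] += 1
    return hist


# Toy 2-D "descriptors" for two images (hypothetical data).
image_a = [(0.0, 0.1), (5.0, 5.1), (0.1, 0.0)]
image_b = [(5.1, 5.0), (4.9, 5.0), (0.0, 0.0)]
codebook = build_codebook(image_a + image_b, k=2)
print(bof_histogram(image_a, codebook), bof_histogram(image_b, codebook))
```

Images with similar histograms can then be grouped by any unsupervised method; the spatial verification and saliency steps mentioned in the abstract are not covered by this sketch.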

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Video transcoding refers to the process of converting a digital video from one format into another format. It is a compute-intensive operation. Therefore, transcoding of a large number of simultaneous video streams requires a large amount of computing resources. Moreover, to handle different load conditions in a cost-efficient manner, the video transcoding service should be dynamically scalable. Infrastructure as a Service Clouds currently offer computing resources, such as virtual machines, under the pay-per-use business model. Thus the IaaS Clouds can be leveraged to provide a cost-efficient, dynamically scalable video transcoding service. To use computing resources efficiently in a cloud computing environment, cost-efficient virtual machine provisioning is required to avoid over-utilization and under-utilization of virtual machines. This thesis presents proactive virtual machine resource allocation and de-allocation algorithms for video transcoding in cloud computing. Since users' requests for videos may change at different times, a check is required to see if the current computing resources are adequate for the video requests. Therefore, the work on admission control is also provided. In addition to admission control, temporal resolution reduction is used to avoid jitters in a video. Furthermore, in a cloud computing environment such as Amazon EC2, the computing resources are more expensive as compared with the storage resources. Therefore, to avoid repetition of transcoding operations, a transcoded video needs to be stored for a certain time. To store all videos for the same amount of time is also not cost-efficient because popular transcoded videos have high access rate while unpopular transcoded videos are rarely accessed. This thesis provides a cost-efficient computation and storage trade-off strategy, which stores videos in the video repository as long as it is cost-efficient to store them.
This thesis also proposes video segmentation strategies for bit rate reduction and spatial resolution reduction video transcoding. The evaluation of proposed strategies is performed using a message passing interface based video transcoder, which uses a coarse-grain parallel processing approach where video is segmented at group of pictures level.
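The computation and storage trade-off can be illustrated with a small decision rule. This is only a sketch of the general idea, not the strategy from the thesis: it assumes that every request for a video that is no longer stored triggers one new transcoding run, and all parameter names and prices are hypothetical.

```python
def should_keep_stored(size_gb, storage_price_per_gb_hour,
                       transcode_hours, compute_price_per_hour,
                       requests_per_hour):
    """Return True while storing the transcoded video costs no more
    than re-transcoding it on demand.

    Assumption: each request for a missing video causes one full
    transcoding run lasting transcode_hours of compute time.
    """
    storage_cost_per_hour = size_gb * storage_price_per_gb_hour
    retranscode_cost_per_hour = (requests_per_hour
                                 * transcode_hours
                                 * compute_price_per_hour)
    return storage_cost_per_hour <= retranscode_cost_per_hour


# Hypothetical prices: a popular video is kept, a rarely accessed one is not.
print(should_keep_stored(1.0, 0.0001, 0.5, 0.1, requests_per_hour=2.0))
print(should_keep_stored(1.0, 0.0001, 0.5, 0.1, requests_per_hour=0.001))
```

Under this rule, the access rate at which storage stops paying off falls out directly from the ratio of the storage price to the compute price, which is why popular and unpopular videos end up with different retention times.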

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This Master's thesis addresses the automatic recognition of fish in video sequences. The issue is pressing for many organisations engaged in fish farming in Finland and Russia, because automating the control and counting of individual species is a turning point for the industry. The difficulties and specific features of the problem were identified in order to find a solution and to propose recommendations for the components of an automated fish recognition system. Background subtraction, Kalman filtering and the Viola-Jones method were implemented for detecting and tracking fish and for estimating their parameters. Both the experimental results and the choice of appropriate methods depend strongly on the quality and type of the input video. Practical experiments demonstrated that not all methods produce good results on real data, although they perform satisfactorily on synthetic data.
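Of the methods listed above, background subtraction is the simplest to illustrate. The sketch below assumes a running-average background model over greyscale frames stored as nested lists; the abstract does not say which variant the thesis implements, and the function names, `alpha` and `threshold` values are hypothetical.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the running-average background model."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


def foreground_mask(background, frame, threshold=30):
    """Mark pixels that differ from the background by more than threshold."""
    return [[1 if abs(f - b) > threshold else 0 for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


# Toy 2x3 greyscale frames (hypothetical data): one bright moving pixel.
background = [[10, 10, 10], [10, 10, 10]]
frame = [[10, 200, 10], [10, 10, 10]]
print(foreground_mask(background, frame))
background = update_background(background, frame)
```

The resulting mask would typically feed a tracker such as the Kalman filter mentioned in the abstract; that stage is not shown here.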

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Abstract

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Summary: A simple culture method for time-lapse recording of bovine embryos

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This work reports the development and implementation of an exercise assignment for the Active and Robot Vision course. In the assignment, a system is designed and implemented that moves objects with a robot arm in three-dimensional space. The system uses digital images to determine the positions of the objects. The implementation presented in this work used thresholding in the HSV colour space to segment objects from the image based on their colours. The binary image resulting from the segmentation was filtered with a median filter to remove image noise. The position of an object in the binary image was determined by labelling connected groups of pixels with a connected-component labelling method. The position of the largest labelled pixel group was taken as the position of the object. The positions of the objects in the image were mapped to three-dimensional coordinates using a calibrated camera. The system moved the objects based on their estimated three-dimensional positions.
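The image-processing part of this pipeline (HSV thresholding, median filtering, connected-component labelling, largest region) can be sketched with the Python standard library alone. This is an illustrative sketch under stated assumptions, not the course implementation: images are nested lists of RGB tuples, connectivity is 4-neighbour, and all function names and threshold values are hypothetical. The mapping to three-dimensional coordinates with a calibrated camera is not covered.

```python
import colorsys
from collections import deque


def hsv_mask(image, h_range, s_min, v_min):
    """Binary mask of pixels whose HSV values pass the thresholds.

    image is a list of rows of (r, g, b) tuples with channels in 0..255.
    Hue wrap-around (reds near 1.0) is not handled in this sketch.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            inside = h_range[0] <= h <= h_range[1] and s >= s_min and v >= v_min
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask


def median_filter3(mask):
    """3x3 median filter on a binary mask; border pixels are left as is."""
    out = [row[:] for row in mask]
    for y in range(1, len(mask) - 1):
        for x in range(1, len(mask[0]) - 1):
            window = sorted(mask[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = window[4]  # middle of the 9 sorted values
    return out


def largest_component_centroid(mask):
    """Label 4-connected regions; return the (x, y) centroid of the largest."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    best = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue, region = deque([(x0, y0)]), []
            while queue:  # breadth-first flood fill of one region
                x, y = queue.popleft()
                region.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if len(region) > len(best):
                best = region
    if not best:
        return None
    return (sum(x for x, _ in best) / len(best),
            sum(y for _, y in best) / len(best))
```

A typical call chain would be `largest_component_centroid(median_filter3(hsv_mask(image, (0.0, 0.05), 0.5, 0.5)))`, with the hue range chosen for the colour of the object being moved.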

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Perceiving the world visually is a basic act for humans, but for computers it is still an unsolved problem. The variability present in natural environments is an obstacle for effective computer vision. The goal of invariant object recognition is to recognise objects in a digital image despite variations in, for example, pose, lighting or occlusion. In this study, invariant object recognition is considered from the viewpoint of feature extraction. The differences between local and global features are studied with emphasis on Hough transform and Gabor filtering based feature extraction. The methods are examined with respect to four capabilities: generality, invariance, stability, and efficiency. Invariant features are presented using both Hough transform and Gabor filtering. A modified Hough transform technique is also presented where the distortion tolerance is increased by incorporating local information. In addition, methods for decreasing the computational costs of the Hough transform employing parallel processing and local information are introduced.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This engineering thesis presents the development of a web-based video editor for the VIDEOS project of Stadia's online communications, together with the technologies used. The video editor, named Fooga, uses Ruby, Ruby on Rails, FFmpeg, Mencoder, ImageMagick and FLVTool2. Ruby is an object-oriented scripting language, Ruby on Rails is a web application framework, and the other technologies are command-line tools that provide Fooga's most important functionality. The goal of this work was to program the basic functionality of Fooga, enabling minimal use by spring 2007. Development will continue until 2009, which also offers thesis topics for several students from the technology and transport degree programme. In addition, this thesis examines the basics of Object-Relational Mapping and compares the ORM features of Ruby on Rails and Java. For Ruby on Rails, the ActiveRecord class is presented, and for Java, Hibernate, which is introduced by way of the DAO/DTO design pattern.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This work presents the development process, based on the prototype model, of software intended for monitoring the provision of services. The report covers the requirements, the object-oriented analysis and design, a description of the implementation process, and the testing of the prototype. The application domain of the software is the monitoring of service provision. The requirements for the application were analysed on the basis of the software market and in accordance with the principles of software engineering. The software prototype was implemented using a hybrid client/server model and a relational database. The software developed is intended for Russian computer clubs that specialise in providing game servers.