939 resultados para visual object detection


Relevância:

80.00% 80.00%

Publicador:

Resumo:

Classic identity negative priming (NP) refers to the finding that when an object is ignored, subsequent naming responses to it are slower than when it has not been previously ignored (Tipper, S.P., 1985. The negative priming effect: inhibitory priming by ignored objects. Q. J. Exp. Psychol. 37A, 571-590). It is unclear whether this phenomenon arises due to the involvement of abstract semantic representations that the ignored object accesses automatically. Contemporary connectionist models propose a key role for the anterior temporal cortex in the representation of abstract semantic knowledge (e.g., McClelland, J.L., Rogers, T.T., 2003. The parallel distributed processing approach to semantic cognition. Nat. Rev. Neurosci. 4, 310-322), suggesting that this region should be involved during performance of the classic identity NP task if it involves semantic access. Using high-field (4 T) event-related functional magnetic resonance imaging, we observed increased BOLD responses in the left anterolateral temporal cortex including the temporal pole that was directly related to the magnitude of each individual's NP effect, supporting a semantic locus. Additional signal increases were observed in the supplementary eye fields (SEF) and left inferior parietal lobule (IPL). (c) 2006 Elsevier Inc. All rights reserved.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This thesis deals with the challenging problem of designing systems able to perceive objects in underwater environments. In the last few decades research activities in robotics have advanced the state of art regarding intervention capabilities of autonomous systems. State of art in fields such as localization and navigation, real time perception and cognition, safe action and manipulation capabilities, applied to ground environments (both indoor and outdoor) has now reached such a readiness level that it allows high level autonomous operations. On the opposite side, the underwater environment remains a very difficult one for autonomous robots. Water influences the mechanical and electrical design of systems, interferes with sensors by limiting their capabilities, heavily impacts on data transmissions, and generally requires systems with low power consumption in order to enable reasonable mission duration. Interest in underwater applications is driven by needs of exploring and intervening in environments in which human capabilities are very limited. Nowadays, most underwater field operations are carried out by manned or remotely operated vehicles, deployed for explorations and limited intervention missions. Manned vehicles, directly on-board controlled, expose human operators to risks related to the stay in field of the mission, within a hostile environment. Remotely Operated Vehicles (ROV) currently represent the most advanced technology for underwater intervention services available on the market. These vehicles can be remotely operated for long time but they need support from an oceanographic vessel with multiple teams of highly specialized pilots. Vehicles equipped with multiple state-of-art sensors and capable to autonomously plan missions have been deployed in the last ten years and exploited as observers for underwater fauna, seabed, ship wrecks, and so on. On the other hand, underwater operations like object recovery and equipment maintenance are still challenging tasks to be conducted without human supervision since they require object perception and localization with much higher accuracy and robustness, to a degree seldom available in Autonomous Underwater Vehicles (AUV). This thesis reports the study, from design to deployment and evaluation, of a general purpose and configurable platform dedicated to stereo-vision perception in underwater environments. Several aspects related to the peculiar environment characteristics have been taken into account during all stages of system design and evaluation: depth of operation and light conditions, together with water turbidity and external weather, heavily impact on perception capabilities. The vision platform proposed in this work is a modular system comprising off-the-shelf components for both the imaging sensors and the computational unit, linked by a high performance ethernet network bus. The adopted design philosophy aims at achieving high flexibility in terms of feasible perception applications, that should not be as limited as in case of a special-purpose and dedicated hardware. Flexibility is required by the variability of underwater environments, with water conditions ranging from clear to turbid, light backscattering varying with daylight and depth, strong color distortion, and other environmental factors. Furthermore, the proposed modular design ensures an easier maintenance and update of the system over time. Performance of the proposed system, in terms of perception capabilities, has been evaluated in several underwater contexts taking advantage of the opportunity offered by the MARIS national project. Design issues like energy power consumption, heat dissipation and network capabilities have been evaluated in different scenarios. Finally, real-world experiments, conducted in multiple and variable underwater contexts, including open sea waters, have led to the collection of several datasets that have been publicly released to the scientific community. The vision system has been integrated in a state of the art AUV equipped with a robotic arm and gripper, and has been exploited in the robot control loop to successfully perform underwater grasping operations.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this paper, we consider the task of recognizing epigraphs in images such as photos taken using mobile devices. Given a set of 17,155 photos related to 14,560 epigraphs, we used a k-NearestNeighbor approach in order to perform the recognition. The contribution of this work is in evaluating state-of-the-art visual object recognition techniques in this specific context. The experimental results conducted show that Vector of Locally Aggregated Descriptors obtained aggregating SIFT descriptors is the best choice for this task.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Background: Light microscopic analysis of diatom frustules is widely used both in basic and applied research, notably taxonomy, morphometrics, water quality monitoring and paleo-environmental studies. In these applications, usually large numbers of frustules need to be identified and / or measured. Although there is a need for automation in these applications, and image processing and analysis methods supporting these tasks have previously been developed, they did not become widespread in diatom analysis. While methodological reports for a wide variety of methods for image segmentation, diatom identification and feature extraction are available, no single implementation combining a subset of these into a readily applicable workflow accessible to diatomists exists. Results: The newly developed tool SHERPA offers a versatile image processing workflow focused on the identification and measurement of object outlines, handling all steps from image segmentation over object identification to feature extraction, and providing interactive functions for reviewing and revising results. Special attention was given to ease of use, applicability to a broad range of data and problems, and supporting high throughput analyses with minimal manual intervention. Conclusions: Tested with several diatom datasets from different sources and of various compositions, SHERPA proved its ability to successfully analyze large amounts of diatom micrographs depicting a broad range of species. SHERPA is unique in combining the following features: application of multiple segmentation methods and selection of the one giving the best result for each individual object; identification of shapes of interest based on outline matching against a template library; quality scoring and ranking of resulting outlines supporting quick quality checking; extraction of a wide range of outline shape descriptors widely used in diatom studies and elsewhere; minimizing the need for, but enabling manual quality control and corrections. Although primarily developed for analyzing images of diatom valves originating from automated microscopy, SHERPA can also be useful for other object detection, segmentation and outline-based identification problems.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This work explores the use of statistical methods in describing and estimating camera poses, as well as the information feedback loop between camera pose and object detection. Surging development in robotics and computer vision has pushed the need for algorithms that infer, understand, and utilize information about the position and orientation of the sensor platforms when observing and/or interacting with their environment.

The first contribution of this thesis is the development of a set of statistical tools for representing and estimating the uncertainty in object poses. A distribution for representing the joint uncertainty over multiple object positions and orientations is described, called the mirrored normal-Bingham distribution. This distribution generalizes both the normal distribution in Euclidean space, and the Bingham distribution on the unit hypersphere. It is shown to inherit many of the convenient properties of these special cases: it is the maximum-entropy distribution with fixed second moment, and there is a generalized Laplace approximation whose result is the mirrored normal-Bingham distribution. This distribution and approximation method are demonstrated by deriving the analytical approximation to the wrapped-normal distribution. Further, it is shown how these tools can be used to represent the uncertainty in the result of a bundle adjustment problem.

Another application of these methods is illustrated as part of a novel camera pose estimation algorithm based on object detections. The autocalibration task is formulated as a bundle adjustment problem using prior distributions over the 3D points to enforce the objects' structure and their relationship with the scene geometry. This framework is very flexible and enables the use of off-the-shelf computational tools to solve specialized autocalibration problems. Its performance is evaluated using a pedestrian detector to provide head and foot location observations, and it proves much faster and potentially more accurate than existing methods.

Finally, the information feedback loop between object detection and camera pose estimation is closed by utilizing camera pose information to improve object detection in scenarios with significant perspective warping. Methods are presented that allow the inverse perspective mapping traditionally applied to images to be applied instead to features computed from those images. For the special case of HOG-like features, which are used by many modern object detection systems, these methods are shown to provide substantial performance benefits over unadapted detectors while achieving real-time frame rates, orders of magnitude faster than comparable image warping methods.

The statistical tools and algorithms presented here are especially promising for mobile cameras, providing the ability to autocalibrate and adapt to the camera pose in real time. In addition, these methods have wide-ranging potential applications in diverse areas of computer vision, robotics, and imaging.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Abstract Ordnance Survey, our national mapping organisation, collects vast amounts of high-resolution aerial imagery covering the entirety of the country. Currently, photogrammetrists and surveyors use this to manually capture real-world objects and characteristics for a relatively small number of features. Arguably, the vast archive of imagery that we have obtained portraying the whole of Great Britain is highly underutilised and could be ‘mined’ for much more information. Over the last year the ImageLearn project has investigated the potential of "representation learning" to automatically extract relevant features from aerial imagery. Representation learning is a form of data-mining in which the feature-extractors are learned using machine-learning techniques, rather than being manually defined. At the beginning of the project we conjectured that representations learned could help with processes such as object detection and identification, change detection and social landscape regionalisation of Britain. This seminar will give an overview of the project and highlight some of our research results.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Terrestrial remote sensing imagery involves the acquisition of information from the Earth's surface without physical contact with the area under study. Among the remote sensing modalities, hyperspectral imaging has recently emerged as a powerful passive technology. This technology has been widely used in the fields of urban and regional planning, water resource management, environmental monitoring, food safety, counterfeit drugs detection, oil spill and other types of chemical contamination detection, biological hazards prevention, and target detection for military and security purposes [2-9]. Hyperspectral sensors sample the reflected solar radiation from the Earth surface in the portion of the spectrum extending from the visible region through the near-infrared and mid-infrared (wavelengths between 0.3 and 2.5 µm) in hundreds of narrow (of the order of 10 nm) contiguous bands [10]. This high spectral resolution can be used for object detection and for discriminating between different objects based on their spectral xharacteristics [6]. However, this huge spectral resolution yields large amounts of data to be processed. For example, the Airbone Visible/Infrared Imaging Spectrometer (AVIRIS) [11] collects a 512 (along track) X 614 (across track) X 224 (bands) X 12 (bits) data cube in 5 s, corresponding to about 140 MBs. Similar data collection ratios are achieved by other spectrometers [12]. Such huge data volumes put stringent requirements on communications, storage, and processing. The problem of signal sbspace identification of hyperspectral data represents a crucial first step in many hypersctral processing algorithms such as target detection, change detection, classification, and unmixing. The identification of this subspace enables a correct dimensionality reduction (DR) yelding gains in data storage and retrieval and in computational time and complexity. Additionally, DR may also improve algorithms performance since it reduce data dimensionality without losses in the useful signal components. The computation of statistical estimates is a relevant example of the advantages of DR, since the number of samples required to obtain accurate estimates increases drastically with the dimmensionality of the data (Hughes phnomenon) [13].

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Background - Image blurring in Full Field Digital Mammography (FFDM) is reported to be a problem within many UK breast screening units resulting in significant proportion of technical repeats/recalls. Our study investigates monitors of differing pixel resolution, and whether there is a difference in blurring detection between a 2.3 MP technical review monitor and a 5MP standard reporting monitor. Methods - Simulation software was created to induce different magnitudes of blur on 20 artifact free FFDM screening images. 120 blurred and non-blurred images were randomized and displayed on the 2.3 and 5MP monitors; they were reviewed by 28 trained observers. Monitors were calibrated to the DICOM Grayscale Standard Display Function. T-test was used to determine whether significant differences exist in blurring detection between the monitors. Results - The blurring detection rate on the 2.3MP monitor for 0.2, 0.4, 0.6, 0.8 and 1 mm blur was 46, 59, 66, 77and 78% respectively; and on the 5MP monitor 44, 70, 83 , 96 and 98%. All the non-motion images were identified correctly. A statistical difference (p <0.01) in the blurring detection rate between the two monitors was demonstrated. Conclusions - Given the results of this study and knowing that monitors as low as 1 MP are used in clinical practice, we speculate that technical recall/repeat rates because of blurring could be reduced if higher resolution monitors are used for technical review at the time of imaging. Further work is needed to determine monitor minimum specification for visual blurring detection.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

One of the most significant research topics in computer vision is object detection. Most of the reported object detection results localise the detected object within a bounding box, but do not explicitly label the edge contours of the object. Since object contours provide a fundamental diagnostic of object shape, some researchers have initiated work on linear contour feature representations for object detection and localisation. However, linear contour feature-based localisation is highly dependent on the performance of linear contour detection within natural images, and this can be perturbed significantly by a cluttered background. In addition, the conventional approach to achieving rotation-invariant features is to rotate the feature receptive field to align with the local dominant orientation before computing the feature representation. Grid resampling after rotation adds extra computational cost and increases the total time consumption for computing the feature descriptor. Though it is not an expensive process if using current computers, it is appreciated that if each step of the implementation is faster to compute especially when the number of local features is increasing and the application is implemented on resource limited ”smart devices”, such as mobile phones, in real-time. Motivated by the above issues, a 2D object localisation system is proposed in this thesis that matches features of edge contour points, which is an alternative method that takes advantage of the shape information for object localisation. This is inspired by edge contour points comprising the basic components of shape contours. In addition, edge point detection is usually simpler to achieve than linear edge contour detection. Therefore, the proposed localization system could avoid the need for linear contour detection and reduce the pathological disruption from the image background. Moreover, since natural images usually comprise many more edge contour points than interest points (i.e. corner points), we also propose new methods to generate rotation-invariant local feature descriptors without pre-rotating the feature receptive field to improve the computational efficiency of the whole system. In detail, the 2D object localisation system is achieved by matching edge contour points features in a constrained search area based on the initial pose-estimate produced by a prior object detection process. The local feature descriptor obtains rotation invariance by making use of rotational symmetry of the hexagonal structure. Therefore, a set of local feature descriptors is proposed based on the hierarchically hexagonal grouping structure. Ultimately, the 2D object localisation system achieves a very promising performance based on matching the proposed features of edge contour points with the mean correct labelling rate of the edge contour points 0.8654 and the mean false labelling rate 0.0314 applied on the data from Amsterdam Library of Object Images (ALOI). Furthermore, the proposed descriptors are evaluated by comparing to the state-of-the-art descriptors and achieve competitive performances in terms of pose estimate with around half-pixel pose error.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A ilustração aplicada ao branding resulta de um modo reflexivo por parte do autor. Esse modo é, por si só, o papel do ilustrador como designer gráfico. A identidade de uma marca nasce da sua história, contexto e sensações, as quais o autor adquire e transmite, segundo as suas vivências, de modo a responder às necessidades das pessoas que o rodeiam. O desenvolvimento de uma marca é um longo processo de análise e reflexão, contínuo e exigente. Aplicando a ilustração a este meio, como objeto visual principal, a língua deixa de ser um entrave e a identidade passa a ser comunicada aos olhos e memória de qualquer um, de forma imediata e eficaz. Conceptualmente, a Tinta Barroca absorve estes princípios, transformando-se numa marca de eventos culturais, embora bastante focada em eventos que podem abranger jantares bem portugueses ou provas de vinho. O projeto foi desenvolvido à base do experimentalismo. Todas as ilustrações da marca foram, numa primeira fase, produzidas manualmente e posteriormente tratadas digitalmente, testando diferentes formas, texturas e materiais. A excessividade ilustrativa é o ponto de partida para comunicar as ideologias da Tinta Barroca, baseando-se no barroquismo, erotismo e nos prazeres da vida. A identidade gráfica da marca misturase com uma decoração já pré-definida: uma mesa bem preenchida e recheada de flores, frutos, vinho e comidas divinais, que se aproximam, pelo excesso, dos princípios do barroco.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Automatic video segmentation plays a vital role in sports videos annotation. This paper presents a fully automatic and computationally efficient algorithm for analysis of sports videos. Various methods of automatic shot boundary detection have been proposed to perform automatic video segmentation. These investigations mainly concentrate on detecting fades and dissolves for fast processing of the entire video scene without providing any additional feedback on object relativity within the shots. The goal of the proposed method is to identify regions that perform certain activities in a scene. The model uses some low-level feature video processing algorithms to extract the shot boundaries from a video scene and to identify dominant colours within these boundaries. An object classification method is used for clustering the seed distributions of the dominant colours to homogeneous regions. Using a simple tracking method a classification of these regions to active or static is performed. The efficiency of the proposed framework is demonstrated over a standard video benchmark with numerous types of sport events and the experimental results show that our algorithm can be used with high accuracy for automatic annotation of active regions for sport videos.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The main objectives of this thesis are to validate an improved principal components analysis (IPCA) algorithm on images; designing and simulating a digital model for image compression, face recognition and image detection by using a principal components analysis (PCA) algorithm and the IPCA algorithm; designing and simulating an optical model for face recognition and object detection by using the joint transform correlator (JTC); establishing detection and recognition thresholds for each model; comparing between the performance of the PCA algorithm and the performance of the IPCA algorithm in compression, recognition and, detection; and comparing between the performance of the digital model and the performance of the optical model in recognition and detection. The MATLAB © software was used for simulating the models. PCA is a technique used for identifying patterns in data and representing the data in order to highlight any similarities or differences. The identification of patterns in data of high dimensions (more than three dimensions) is too difficult because the graphical representation of data is impossible. Therefore, PCA is a powerful method for analyzing data. IPCA is another statistical tool for identifying patterns in data. It uses information theory for improving PCA. The joint transform correlator (JTC) is an optical correlator used for synthesizing a frequency plane filter for coherent optical systems. The IPCA algorithm, in general, behaves better than the PCA algorithm in the most of the applications. It is better than the PCA algorithm in image compression because it obtains higher compression, more accurate reconstruction, and faster processing speed with acceptable errors; in addition, it is better than the PCA algorithm in real-time image detection due to the fact that it achieves the smallest error rate as well as remarkable speed. On the other hand, the PCA algorithm performs better than the IPCA algorithm in face recognition because it offers an acceptable error rate, easy calculation, and a reasonable speed. Finally, in detection and recognition, the performance of the digital model is better than the performance of the optical model.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Collecting and analysing data is an important element in any field of human activity and research. Even in sports, collecting and analyzing statistical data is attracting a growing interest. Some exemplar use cases are: improvement of technical/tactical aspects for team coaches, definition of game strategies based on the opposite team play or evaluation of the performance of players. Other advantages are related to taking more precise and impartial judgment in referee decisions: a wrong decision can change the outcomes of important matches. Finally, it can be useful to provide better representations and graphic effects that make the game more engaging for the audience during the match. Nowadays it is possible to delegate this type of task to automatic software systems that can use cameras or even hardware sensors to collect images or data and process them. One of the most efficient methods to collect data is to process the video images of the sporting event through mixed techniques concerning machine learning applied to computer vision. As in other domains in which computer vision can be applied, the main tasks in sports are related to object detection, player tracking, and to the pose estimation of athletes. The goal of the present thesis is to apply different models of CNNs to analyze volleyball matches. Starting from video frames of a volleyball match, we reproduce a bird's eye view of the playing court where all the players are projected, reporting also for each player the type of action she/he is performing.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Il seguente elaborato affronta l'implementazione di un algoritmo che affronta un problema di controllo di processo in ambito industriale utilizzando algoritmi di object detection. Infatti, il progetto concordato con il professore Di Stefano si è svolto in collaborazione con l’azienda Pirelli, nell’ambito della produzione di pneumatici. Lo scopo dell'algoritmo implementato è di verificare il preciso orientamento di elementi grafici della copertura, utilizzati dalle case automobilistiche per equipaggiare correttamente le vetture. In particolare, si devono individuare delle scritte sul battistrada della copertura e identificarne la posizione rispetto ad altri elementi fissati su di essa. La tesi affronta questo task in due parti distinte: la prima consiste nel training di algoritmi di deep learning per il riconoscimento degli elementi grafici e del battistrada, la seconda è un decisore che opera a valle del primo sistema utilizzando gli output delle reti allenate.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This report describes the realization of a system, in which an object detection model will be implemented, whose aim is to detect the presence of people in images. This system could be used for several applications: for example, it could be carried on board an aircraft or a drone. In this case, the system is designed in such a way that it can be mounted on light/medium weight helicopters, helping the operator to find people in emergency situations. In the first chapter the use of helicopters for civil protection is analysed and applications similar to this case study are listed. The second chapter describes the choice of the hardware devices that have been used to implement a prototype of a system to collect, analyse and display images. At first, the PC necessary to process the images was chosen, based on the characteristics of the algorithms that are necessary to run the analysis. In the further, a camera that could be compatible with the PC was selected. Finally, the battery pack was chosen taking into account the electrical consumption of the devices. The third chapter illustrates the algorithms used for image analysis. In the fourth, some of the requirements listed in the regulations that must be taken into account for carrying on board all the devices have been briefly analysed. In the fifth chapter the activity of design and modelling, with the CAD Solidworks, the devices and a prototype of a case that will house them is described. The sixth chapter discusses the additive manufacturing, since the case was printed exploiting this technology. In the seventh chapter, part of the tests that must be carried out on the equipment to certificate it have been analysed, and some simulations have been carried out. In the eighth chapter the results obtained once loaded the object detection model on a hardware for image analyses were showed. In the ninth chapter, conclusions and future applications were discussed.