964 results for video surveillance


Relevance:

60.00%

Publisher:

Abstract:

The term Ambient Intelligence (AmI) refers to a vision of the future of the information society in which smart electronic environments are sensitive and responsive to the presence of people and their activities (context awareness). In an ambient intelligence world, devices work in concert to support people in carrying out their everyday activities, tasks and rituals in an easy, natural way, using information and intelligence hidden in the network connecting these devices. This promotes the creation of pervasive environments that improve the quality of life of the occupants and enhance the human experience. AmI stems from the convergence of three key technologies: ubiquitous computing, ubiquitous communication and natural interfaces. Ambient intelligent systems are heterogeneous and require close cooperation between several hardware/software technologies and disciplines, including signal processing, networking and protocols, embedded systems, information management, and distributed algorithms. Since a large number of fixed and mobile sensors are embedded in the environment, Wireless Sensor Networks (WSNs) are one of the most relevant enabling technologies for AmI. WSNs are complex systems made up of a number of sensor nodes that can be deployed in a target area to sense physical phenomena and communicate with other nodes and base stations. These simple devices typically embed a low-power computational unit (microcontrollers, FPGAs, etc.), a wireless communication unit, one or more sensors and some form of energy supply (either batteries or energy-scavenging modules). WSNs promise to revolutionize the interactions between the physical world and human beings. Low cost, low computational power, low energy consumption and small size are characteristics that must be taken into consideration when designing and dealing with WSNs. To fully exploit the potential of distributed sensing approaches, a set of challenges must be addressed.
Sensor nodes are inherently resource-constrained systems with very low power consumption and small size requirements, which enable them to reduce interference with the physical phenomena sensed and allow easy, low-cost deployment. They have limited processing speed, storage capacity and communication bandwidth, which must be used efficiently to increase the degree of local "understanding" of the observed phenomena. Video sensors are a particular case of sensor node. They hold strong interest for a wide range of contexts such as military, security, robotics and, most recently, consumer applications. Vision sensors are extremely effective for medium- to long-range sensing because vision provides rich information to human operators. However, image sensors generate a huge amount of data, which must be heavily processed before transmission because of the scarce bandwidth of radio interfaces. In particular, in video surveillance it has been shown that source-side compression is mandatory because of limited bandwidth and delay constraints. Moreover, there is ample opportunity for performing higher-level processing functions, such as object recognition, that have the potential to drastically reduce the required bandwidth (e.g. by transmitting compressed images only when something "interesting" is detected). The energy cost of image processing must, however, be carefully minimized. Imaging can play an important role in sensing devices for ambient intelligence. Computer vision can, for instance, be used for recognising persons and objects and for recognising behaviour such as illness and rioting. Using a wireless camera as a camera mote opens the way for distributed scene analysis. More eyes see more than one, and a camera system that can observe a scene from multiple directions would be able to overcome occlusion problems and could describe objects in their true 3D appearance. Real-time implementations of these approaches are a recently opened field of research.
In this thesis we pay attention to the realities of the hardware/software technologies and the design needed to realize systems for distributed monitoring, attempting to propose solutions to open issues and to fill the gap between AmI scenarios and hardware reality. The physical implementation of an individual wireless node is constrained by three important metrics, which are outlined below. Although the design of the sensor network and its sensor nodes is strictly application dependent, a number of constraints should almost always be considered. Among them: • Small form factor, to reduce node intrusiveness. • Low power consumption, to reduce battery size and extend node lifetime. • Low cost, for widespread diffusion. These limitations typically result in the adoption of low-power, low-cost devices such as low-power microcontrollers with a few kilobytes of RAM and tens of kilobytes of program memory, on which only simple data processing algorithms can be implemented. However, the overall computational power of the WSN can be very large, since the network presents a high degree of parallelism that can be exploited through the adoption of ad hoc techniques. Furthermore, through the fusion of information from the dense mesh of sensors, even complex phenomena can be monitored. In this dissertation we present our results in building several AmI applications suitable for a WSN implementation. The work can be divided into two main areas: Low Power Video Sensor Nodes and Video Processing Algorithms, and Multimodal Surveillance. Low Power Video Sensor Nodes and Video Processing Algorithms: In comparison to scalar sensors, such as temperature, pressure, humidity, velocity, and acceleration sensors, vision sensors generate much higher bandwidth data due to the two-dimensional nature of their pixel array. We have tackled all the constraints listed above and have proposed solutions to overcome the current WSN limits for video sensor nodes.
We have designed and developed wireless video sensor nodes focusing on small size and flexibility of reuse in different applications. The video nodes target a different design point: portability (on-board power supply, wireless communication) and a tight power budget (500 mW), while still providing a prominent level of intelligence, namely sophisticated classification algorithms and a high level of reconfigurability. We developed two different video sensor nodes. The device architecture of the first is based on a low-cost, low-power FPGA+microcontroller system-on-chip. The second is based on an ARM9 processor. Both systems, designed within the above-mentioned power envelope, could operate in a continuous fashion with a Li-Polymer battery pack and a solar panel. We present novel low-power, low-cost video sensor nodes which, in contrast to sensors that just watch the world, are capable of comprehending the perceived information in order to interpret it locally. Featuring such intelligence, these nodes would be able to cope with tasks such as recognition of unattended bags in airports, persons carrying potentially dangerous objects, etc., which normally require a human operator. Vision algorithms for object detection and acquisition, such as human detection with Support Vector Machine (SVM) classification and abandoned/removed object detection, are implemented, described and illustrated on real-world data. Multimodal surveillance: In several setups the use of wired video cameras may not be possible. For this reason, building an energy-efficient wireless vision network for monitoring and surveillance is one of the major efforts in the sensor network and distributed monitoring and surveillance communities.
Pyroelectric Infra-Red (PIR) sensors have been used to extend the lifetime of a solar-powered video sensor node by providing an energy-level-dependent trigger to the video camera and the wireless module. This approach has been shown to extend node lifetime and possibly result in continuous operation of the node. Being low-cost, passive (thus low-power) and of limited form factor, PIR sensors are well suited for WSN applications. Moreover, aggressive power management policies are essential for reducing power consumption and achieving long-term operation of standalone distributed cameras. We have used an adaptive controller, Model Predictive Control (MPC), to improve system performance, outperforming naive power management policies.
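
As an illustration of the energy-level-dependent PIR trigger described above, the following is a minimal sketch only. The abstract does not give the actual policy, so the `NodeState` type, the function names and every threshold value here are hypothetical assumptions: the lower the stored energy, the more PIR detections are required before the camera and radio are woken.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Snapshot of a hypothetical solar-powered video node (assumed fields)."""
    battery_fraction: float  # stored energy, 0.0 (empty) to 1.0 (full)
    pir_events: int          # PIR detections counted in the current window


def required_pir_events(battery_fraction: float) -> int:
    """Map the energy level to a wake-up threshold (illustrative values only)."""
    if battery_fraction >= 0.6:
        return 1   # plenty of energy: wake the camera on any detection
    if battery_fraction >= 0.3:
        return 3   # moderate energy: require sustained motion
    return 6       # low energy: wake only on strong, repeated activity


def should_wake_camera(state: NodeState) -> bool:
    """Wake the camera and radio only if PIR activity meets the threshold."""
    return state.pir_events >= required_pir_events(state.battery_fraction)
```

A predictive controller such as the MPC mentioned in the abstract would presumably replace the fixed thresholds with values chosen from a forecast of harvested energy; that part is not sketched here.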

Relevance:

60.00%

Publisher:

Abstract:

Keywords: bat behavior, white-nose syndrome, behavior, video surveillance, arousal patterns. White-Nose Syndrome (WNS) is a disease of hibernating bats caused by the fungal pathogen Geomyces destructans. The fungus, which was first noted in 2006, invades bats' wings and other exposed membranes, eventually resulting in death. Researchers have yet to understand many aspects of this disease, including basic etiology and epidemiology. There is also a lack of information on how fungal infection may change the behavior of healthy bats during hibernation or how changes in behavior may influence disease progression. Based upon the physiological changes that are known to occur in affected bats, and upon anecdotal observations of aberrant behavior in these bats, I hypothesized that WNS would significantly change the behavior of the little brown myotis (Myotis lucifugus). My research examined the behavior of hibernating bats during arousals from torpor. I compared WNS-affected and unaffected bats, in the field and in captivity, using motion-sensitive infrared cameras.
Flight maneuverability and echolocation were also compared between WNS-affected and unaffected bats during arousals from hibernation to detect changes in the bats' ability to perform basic locomotion or potentially catch insect prey. Lastly, hibernating bats were artificially disturbed and their arousal patterns were monitored to examine changes in the response to external stimuli between WNS-affected and unaffected bats. Bats with WNS groomed for longer periods of time after arousing from torpor, both in the field and in captivity. They also engaged in longer periods of any sort of activity during these arousals. There were no changes in acoustical signaling during flight tests, and changes in flight maneuverability were found only in bats that were seen "staging" near the entrance of the mine, which is itself a unique behavior exhibited by affected bats. At this point these bats were likely near death and could barely fly at all. In response to external stimuli, bats with WNS were less likely to arouse than unaffected bats. However, when they did arouse, WNS-affected bats did so over similar time frames as WNS-unaffected bats. The behaviors of bats with WNS did not change as drastically as expected. There seems to be little to no effect on their ability to fly or forage until much later stages of the disease, when they are likely near death. WNS-affected bats groom more, which could alter the way they use energy reserves during hibernation, possibly leading to starvation and eventually death. The decreased likelihood of arousals in response to external cues may be the result of spending more energy during previous and increasingly frequent arousals. While it is clear that WNS does result in changes in behavior, whether these changes are directly in response to fungal skin infection or to some other component of the syndrome, such as decreased energy availability or loss of homeostasis, is unknown.

Relevance:

60.00%

Publisher:

Abstract:

This final-year project (Proyecto de fin de carrera, PFC), entitled LA VÍDEOVIGILANCIA: TECNOLOGÍAS ACTUALES Y ASPECTOS SOCIOPOLÍTICOS (Video Surveillance: Current Technologies and Sociopolitical Aspects), aims to study video surveillance systems based on IP cameras, for security, control or supervision purposes. We describe current, latest-generation IP-camera video surveillance systems, whose main virtue is communication with other places, public or private, and the ability to view, live or recorded, what is happening or has happened at a given place and time, over the IP communication protocol. We explain IP video surveillance systems from the most basic to the most complex, and also their practical deployment through the multiple interconnections they entail. At this point, the questions that give rise to this PFC arise. IP video surveillance systems capture images through IP cameras, which makes viewing/recording and control easy, since it is not necessary to be present, and which allows interaction with other current digital systems of various kinds, thanks to the IP protocol. These IP systems are put into practice through the required installations, which may be simple or very complex. Owing to the massive growth of the various current IP-camera technologies for video surveillance in public and private places in today's society, video surveillance is a particularly invasive medium. It is therefore necessary both that conditions exist that legitimize the processing of data of identifiable persons and that the principles and guarantees to be applied are defined, since these will affect people's rights, which makes it necessary to establish certain guarantees.
Cases in which images belonging to identified or identifiable persons are captured and/or processed for video surveillance purposes have obliged Spain, as provided by Directive 95/46/EC of the European Parliament, to regulate this situation through the Organic Law on Data Protection (LOPD) 15/1999 of 13 December, under the procedures of the Spanish State in sociopolitical matters, and to give force to this law through the approval of Instruction 1/2006 of 8 November 2006, whose highest authority is the Spanish Data Protection Agency (AGPD).
Once the motivation and justification of the project have been set out, a set of objectives follows. The project objectives fall into two main classes: main objectives and secondary objectives. The main objectives of this PFC stem directly from the needs originally raised regarding video surveillance, both technologically, based on IP cameras for capturing and/or processing images, and sociopolitically, where we try to describe, through guidelines and criteria with practical cases, how Instruction 1/2006 should be applied under the LOPD in video surveillance, with regard to the protection of data that may affect people's rights. The secondary objectives extend the primary objective and are of a quantifying nature in this PFC, giving a more exhaustive explanation of the main objective.

Relevance:

60.00%

Publisher:

Abstract:

Image segmentation is an important field in computer vision and one of its most active research areas, with applications in image understanding, object detection, face recognition, video surveillance and medical image processing. Image segmentation is a challenging problem in general, but especially in the biological and medical image fields, where the imaging techniques usually produce cluttered and noisy images and near-perfect accuracy is required in many cases. In this thesis we first review and compare some standard techniques widely used for medical image segmentation. These techniques use pixel-wise classifiers and introduce weak pairwise regularization, which is insufficient in many cases. We study the difficulty they have in capturing high-level structural information about the objects to segment. This deficiency leads to many erroneous detections, ragged boundaries, incorrect topological configurations and wrong shapes. To deal with these problems, we propose a new regularization method that learns shape and topological information from training data in a nonparametric way using high-order potentials. High-order potentials are becoming increasingly popular in computer vision. However, the exact representation of a general higher-order potential defined over many variables is computationally infeasible. We use a compact representation of the potentials based on a finite set of patterns learned from training data that, in turn, depends on the observations. Thanks to this representation, high-order potentials can be converted into pairwise potentials with some added auxiliary variables and minimized with tree-reweighted message passing (TRW) and belief propagation (BP) techniques. Both synthetic and real experiments confirm that our model fixes the errors of weaker approaches.
Even with high-level regularization, perfect accuracy is still unattainable, and human editing of the segmentation results is necessary. Manual editing is tedious and cumbersome, and tools that assist the user are greatly appreciated. These tools need to be precise, but also fast enough to be used interactively. Active contours are a good solution: they are good for precise boundary detection and, instead of finding a global solution, they provide fine tuning of previously existing results. However, they require an implicit representation to deal with topological changes of the contour, and this leads to partial differential equations (PDEs) that are computationally costly to solve and may present numerical stability issues. We present a morphological approach to contour evolution based on a new morphological curvature operator valid for surfaces of any dimension. We approximate the numerical solution of the contour evolution PDE by the successive application of a set of morphological operators defined on a binary level set. These operators are very fast, do not suffer from numerical stability issues, and do not degrade the level set function, so there is no need to reinitialize it. Moreover, their implementation is much easier than that of their PDE counterparts, since they do not require sophisticated numerical algorithms. From a theoretical point of view, we delve into the connections between differential and morphological operators, and introduce novel results in this area. We validate the approach by providing a morphological implementation of geodesic active contours, active contours without edges, and turbopixels. In the experiments conducted, the morphological implementations converge to solutions equivalent to those achieved by traditional numerical solutions, but with significant gains in simplicity, speed, and stability.
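
To make the idea of a morphological curvature step on a binary level set concrete, here is a small sketch. The abstract does not spell out the operator, so this assumes one common formulation: a sup-inf (SI) and an inf-sup (IS) operator taken over four 3-pixel line segments, composed as SI after IS. The function names and the border handling (edge replication) are illustrative choices, not the thesis's implementation.

```python
# Four 3-pixel line segments through the centre pixel:
# horizontal, vertical, and the two diagonals (offsets are (row, col)).
SEGMENTS = [
    [(0, -1), (0, 0), (0, 1)],
    [(-1, 0), (0, 0), (1, 0)],
    [(-1, -1), (0, 0), (1, 1)],
    [(-1, 1), (0, 0), (1, -1)],
]


def _pixel(u, r, c):
    """Read u[r][c], clamping indices so borders replicate the nearest pixel."""
    r = min(max(r, 0), len(u) - 1)
    c = min(max(c, 0), len(u[0]) - 1)
    return u[r][c]


def sup_inf(u):
    """SI: for each pixel, the max over segments of the min along the segment."""
    return [[max(min(_pixel(u, r + dr, c + dc) for dr, dc in seg)
                 for seg in SEGMENTS)
             for c in range(len(u[0]))]
            for r in range(len(u))]


def inf_sup(u):
    """IS: for each pixel, the min over segments of the max along the segment."""
    return [[min(max(_pixel(u, r + dr, c + dc) for dr, dc in seg)
                 for seg in SEGMENTS)
             for c in range(len(u[0]))]
            for r in range(len(u))]


def curvature_step(u):
    """One smoothing step on a binary level set: SI applied after IS."""
    return sup_inf(inf_sup(u))
```

On a binary grid this step removes an isolated pixel and trims the corners of a square while leaving a straight edge untouched, which is the qualitative behaviour expected of curvature-driven smoothing. No PDE is solved and the level set stays binary, in line with the abstract's remark that no reinitialization is needed.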

Relevance:

60.00%

Publisher:

Abstract:

In many classification problems, it is necessary to consider the specific location in an n-dimensional space from which features have been calculated. For example, considering the location of features extracted from specific areas of a two-dimensional space, such as an image, could improve the understanding of a scene for a video surveillance system. In the same way, the same features extracted from different locations could mean different actions for a 3D HCI system. In this paper, we present a self-organizing feature map able to preserve the topology of the locations of an n-dimensional space from which the feature vectors have been extracted. The main contribution is to implicitly preserve the topology of the original space, because considering the locations of the extracted features and their topology could ease the solution of certain problems. Specifically, the paper proposes the n-dimensional constrained self-organizing map preserving the input topology (nD-SOM-PINT). Features in adjacent areas of the n-dimensional space used to extract the feature vectors are explicitly placed in adjacent areas of the nD-SOM-PINT, constraining the neural network structure and learning. As a case study, the neural network has been instantiated to represent and classify features, such as trajectories extracted from a sequence of images, at a high level of semantic understanding. Experiments have been carried out thoroughly using the CAVIAR datasets (Corridor, Frontal and Inria), taking into account the global behaviour of an individual, in order to validate the ability to preserve the topology of the two-dimensional space and obtain high-performance trajectory classification, in contrast to not considering the location of features. Moreover, a brief example has been included to validate the nD-SOM-PINT proposal in a domain other than individual trajectories.
Results confirm the high accuracy of the nD-SOM-PINT, which outperforms previous methods for classifying the same datasets.
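
The abstract does not give the constrained update rule of the nD-SOM-PINT, so the sketch below shows only the classic, unconstrained self-organizing map update for reference. The 1-D chain of neurons, the Gaussian neighbourhood and the parameter names `lr` and `sigma` are assumptions for illustration; the proposed network additionally ties neurons to locations of the input space, which is not modelled here.

```python
import math


def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to x (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)))


def som_step(weights, x, lr=0.5, sigma=1.0):
    """One classic SOM update on a 1-D chain of neurons; returns the BMU index.

    weights is a list of weight vectors and is updated in place.
    """
    bmu = best_matching_unit(weights, x)
    for i, w in enumerate(weights):
        # Gaussian neighbourhood: neurons near the BMU on the chain move more.
        h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
        weights[i] = [wj + lr * h * (xj - wj) for wj, xj in zip(w, x)]
    return bmu
```

Because neighbouring neurons are pulled toward the same sample, nearby neurons end up representing similar inputs; this is the topology-preserving property that the paper extends to the locations from which features are extracted.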

Relevance:

60.00%

Publisher:

Abstract:

Several studies have tried to find countermeasures against musculoskeletal de-conditioning during bed-rest, but none of them yielded decisive results. We hypothesised that resistive vibration exercise (RVE) might be a suitable training modality. We have therefore carried out a bed-rest study to evaluate its feasibility and efficacy during 56 days of bed-rest. Twenty healthy male volunteers aged 24 to 43 years were recruited and, after medical check-ups, randomised to a non-exercising control (Ctrl) group or a group that performed RVE 11 times per week. Strict bed-rest was controlled by video surveillance. The diet was controlled. RVE was performed in supine position, with a static force component of about twice the body weight and a smaller dynamic force component. RVE comprised four different units (squats, heel raises, toe raises, kicks), each of which lasted 60 - 100 seconds. Pre- and post-exercise levels of lactate were measured once weekly. Body weight was measured daily on a bed scale. Pain questionnaires were obtained at regular intervals during and after the bed-rest. Vibration frequency was set to 19 Hz at the beginning and progressed to 25.9 Hz (SD 1.9) at the end of the study, suggesting that the dynamic force component increased by 90%. The maximum sustainable exercise time for squat exercise increased from 86 s (SD 21) on day 11 of the bed-rest to 176 s (SD 73) on day 53 (p = 0.006). On the same days, post-exercise lactate levels increased from 6.9 mmol/l (SD 2.3) to 9.2 mmol/l (SD 3.5, p = 0.01). On average, body weight was unchanged in both groups during bed-rest, but single individuals in both groups showed significant weight changes ranging from -10% to +10% (p < 0.001). Lower limb pain was more frequent during bed-rest in the RVE subjects than in Ctrl (p = 0.035).
During early recovery, subjects of both groups suffered from muscle pain to a comparable extent, but foot pain was more common in Ctrl than in RVE (p = 0.013 for plantar pain, p = 0.074 for dorsal foot pain). Our results indicate that RVE is feasible twice daily during bed-rest in young healthy males, provided that one afternoon and one entire day per week are free. Exercise progression, mainly by progression of vibration frequency, yielded increases in maximum sustainable exercise time and blood lactate. In conclusion, RVE as performed in this study appears to be safe.
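
The stated increase of about 90% in the dynamic force component can be checked with a rough calculation. The abstract does not give the derivation; the version below assumes sinusoidal vibration of constant amplitude A, so that peak acceleration, and with it the dynamic force on a fixed mass, scales with the square of the frequency f.

```latex
a_{\text{peak}} = A\,(2\pi f)^{2}
\quad\Longrightarrow\quad
\frac{F_{\text{dyn}}(25.9\ \text{Hz})}{F_{\text{dyn}}(19\ \text{Hz})}
= \left(\frac{25.9}{19}\right)^{2} \approx 1.86
```

Under these assumptions the dynamic force rises by roughly 86%, which is of the same order as the figure of 90% quoted in the abstract.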

Relevance:

60.00%

Publisher:

Abstract:

The main challenges of multimedia data retrieval lie in the effective mapping between low-level features and high-level concepts, and in the individual users' subjective perceptions of multimedia content. The objectives of this dissertation are to develop an integrated multimedia indexing and retrieval framework with the aim of bridging the gap between semantic concepts and low-level features. To achieve this goal, a set of core techniques has been developed, including image segmentation, content-based image retrieval, object tracking, video indexing, and video event detection. These core techniques are integrated in a systematic way to enable the semantic search for images/videos, and can be tailored to solve problems in other multimedia-related domains. In image retrieval, two new methods of bridging the semantic gap are proposed: (1) For general content-based image retrieval, a stochastic mechanism is utilized to enable the long-term learning of high-level concepts from a set of training data, such as user access frequencies and access patterns of images. (2) In addition to whole-image retrieval, a novel multiple instance learning framework is proposed for object-based image retrieval, by which a user is allowed to search more effectively for images that contain multiple objects of interest. An enhanced image segmentation algorithm is developed to extract the object information from images. This segmentation algorithm is further used in video indexing and retrieval, by which a robust video shot/scene segmentation method is developed based on low-level visual feature comparison, object tracking, and audio analysis. Based on shot boundaries, a novel data mining framework is further proposed to detect events in soccer videos, while fully utilizing the multi-modality features and object information obtained through video shot/scene detection.
^ Another contribution of this dissertation is the potential of the above techniques to be tailored and applied to other multimedia applications. This is demonstrated by their utilization in traffic video surveillance applications. The enhanced image segmentation algorithm, coupled with an adaptive background learning algorithm, improves the performance of vehicle identification. A sophisticated object tracking algorithm is proposed to track individual vehicles, while the spatial and temporal relationships of vehicle objects are modeled by an abstract semantic model. ^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Trophic downgrading of ecosystems necessitates a functional understanding of trophic cascades. Identifying the presence of cascades, and the mechanisms through which they occur, is particularly important for seagrass meadows, which are among the most threatened ecosystems on Earth. Shark Bay, Western Australia provides a model system to investigate the potential importance of top-down effects in a relatively pristine seagrass ecosystem. The role of megagrazers in the Shark Bay system has been previously investigated, but the role of macrograzers (i.e., teleosts), and their importance relative to megagrazers, remains unknown. The objective of my dissertation was to elucidate the importance of teleost macrograzers in transmitting top-down effects in seagrass ecosystems. Seagrasses and macroalgae were the main food of the abundant teleost Pelates octolineatus, but stable isotopic values suggested that algae may contribute a larger portion of assimilated food than suggested by gut contents. Pelates octolineatus is at risk from numerous predators, with pied cormorants (Phalacrocorax varius) taking the majority of tethered P. octolineatus. Using a combination of fish trapping and unbaited underwater video surveillance, I found that the relative abundance of P. octolineatus was greater in interior areas of seagrass banks during the cold season, and that the mean length of P. octolineatus was greater in these areas compared to along edges of banks. Finally, I used seagrass transplants and exclosure experiments to determine the relative effect of megagrazers and macrograzers on the establishment and persistence of three species of seagrasses in interior microhabitats. Teleost grazing had the largest impact on seagrass species with the highest nutrient content, and these impacts were primarily observed during the warm season. 
My findings are consistent with predictions of a behaviorally-mediated trophic cascade initiated by tiger sharks (Galeocerdo cuvier) and transmitted through herbivorous fishes and their predators.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper presents a novel theory for performing multi-agent activity recognition without requiring large training corpora. The reduced need for data means that robust probabilistic recognition can be performed within domains where annotated datasets are traditionally unavailable. Complex human activities are composed of sequences of underlying primitive activities. We do not assume that the exact temporal ordering of primitives is necessary, so we can represent a complex activity using an unordered bag. Our three-tier architecture comprises low-level video tracking, event analysis and high-level inference. High-level inference is performed using a new, cascading extension of the Rao–Blackwellised Particle Filter. Simulated annealing is used to identify pairs of agents involved in multi-agent activity. We validate our framework using the benchmarked PETS 2006 video surveillance dataset and our own sequences, and achieve a mean recognition F-Score of 0.82. Our approach achieves a mean improvement of 17% over a Hidden Markov Model baseline.
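The unordered-bag representation mentioned above can be illustrated with a short Python sketch. This is only an assumption-laden illustration, not the paper's inference method: the primitive labels (`enter`, `place_bag`, `leave`) and the `bag_overlap` score are hypothetical and were chosen solely to show that a bag ignores temporal order.

```python
from collections import Counter


def to_bag(primitives):
    """Collapse an ordered sequence of primitive activities into an unordered bag (multiset)."""
    return Counter(primitives)


def bag_overlap(observed, template):
    """Fraction of the template's primitives that also occur in the observed bag (0.0 to 1.0)."""
    total = sum(template.values())
    if total == 0:
        return 0.0
    shared = sum((observed & template).values())  # multiset intersection
    return shared / total


# Hypothetical primitive labels, used only for illustration.
template = to_bag(["enter", "place_bag", "leave"])
seq_a = ["enter", "place_bag", "leave"]
seq_b = ["place_bag", "enter", "leave"]  # same primitives, different order

# Two sequences with the same primitives map to the same bag.
same_bag = to_bag(seq_a) == to_bag(seq_b)

# A partial observation matches two of the three template primitives.
score = bag_overlap(to_bag(["enter", "leave"]), template)
```

A bag of this kind discards ordering by construction, which is the property the abstract relies on; the probabilistic inference over such bags is outside the scope of this sketch.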

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Objective
Pedestrian detection under video surveillance systems has always been a hot topic in computer vision research. These systems are widely used in train stations, airports, large commercial plazas, and other public places. However, pedestrian detection remains difficult because of complex backgrounds. Given its development in recent years, the visual attention mechanism has attracted increasing attention in object detection and tracking research, and previous studies have achieved substantial progress and breakthroughs. We propose a novel pedestrian detection method based on the semantic features under the visual attention mechanism.
Method
The proposed semantic feature-based visual attention model is a spatial-temporal model that consists of two parts: the static visual attention model and the motion visual attention model. The static visual attention model in the spatial domain is constructed by combining bottom-up with top-down attention guidance. Based on the characteristics of pedestrians, the bottom-up visual attention model of Itti is improved by intensifying the orientation vectors of elementary visual features to make the visual saliency map suitable for pedestrian detection. In terms of pedestrian attributes, skin color is selected as a semantic feature for pedestrian detection. The regional and Gaussian models are adopted to construct the skin color model. Skin feature-based visual attention guidance is then proposed to complete the top-down process. The bottom-up and top-down visual attentions are linearly combined using weights obtained from experiments to construct the static visual attention model in the spatial domain. The spatial-temporal visual attention model is then constructed via the motion features in the temporal domain. Based on the static visual attention model in the spatial domain, the frame difference method is combined with optical flow to detect motion vectors. Filtering is applied to process the field of motion vectors. The saliency of motion vectors can be evaluated via motion entropy to make the selected motion feature more suitable for the spatial-temporal visual attention model.
Result
Standard datasets and practical videos are selected for the experiments. The experiments are performed on a MATLAB R2012a platform. The experimental results show that our spatial-temporal visual attention model demonstrates favorable robustness under various scenes, including indoor train station surveillance videos and outdoor scenes with swaying leaves. Our proposed model outperforms the visual attention model of Itti, the graph-based visual saliency model, the phase spectrum of quaternion Fourier transform model, and the motion channel model of Liu in terms of pedestrian detection. The proposed model achieves a 93% accuracy rate on the test video.
Conclusion
This paper proposes a novel pedestrian detection method based on the visual attention mechanism. A spatial-temporal visual attention model that uses low-level and semantic features is proposed to calculate the saliency map. Based on this model, the pedestrian targets can be detected through focus of attention shifts. The experimental results verify the effectiveness of the proposed attention model for detecting pedestrians.
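Two of the steps described in the Method section, the linear combination of bottom-up and top-down saliency maps and the frame difference method, can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the abstract does not report the experimentally obtained weights or any threshold, so `w_bu`, `w_td` and `threshold` below are placeholder values, and the tiny 2x2 maps are invented for illustration. The Itti model, the skin color model, optical flow and motion entropy are not reproduced here.

```python
def combine_saliency(bottom_up, top_down, w_bu=0.5, w_td=0.5):
    """Pixel-wise weighted sum of two equally sized saliency maps (lists of rows).

    The weights are placeholders; the paper obtains its weights experimentally.
    """
    return [
        [w_bu * b + w_td * t for b, t in zip(row_b, row_t)]
        for row_b, row_t in zip(bottom_up, top_down)
    ]


def frame_difference(prev_frame, curr_frame, threshold=20):
    """Binary motion mask: 1 where the absolute intensity change exceeds the threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(row_p, row_c)]
        for row_p, row_c in zip(prev_frame, curr_frame)
    ]


# Invented 2x2 saliency maps with values in [0, 1].
bottom_up = [[0.2, 0.8], [0.0, 0.4]]
top_down = [[0.6, 0.0], [0.0, 1.0]]
static_map = combine_saliency(bottom_up, top_down)

# Invented 2x2 grey-level frames; only the top-right pixel changes strongly.
prev_frame = [[10, 10], [10, 10]]
curr_frame = [[10, 90], [12, 10]]
motion_mask = frame_difference(prev_frame, curr_frame)
```

In a full pipeline the motion mask would be refined with optical flow and filtering before being merged with the static map, as the abstract describes.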

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Surveillance networks are typically monitored by a few people, viewing several monitors displaying the camera feeds. It is then very difficult for a human operator to effectively detect events as they happen. Recently, computer vision research has begun to address ways to automatically process some of this data, to assist human operators. Object tracking, event recognition, crowd analysis and human identification at a distance are being pursued as a means to aid human operators and improve the security of areas such as transport hubs. The task of object tracking is key to the effective use of more advanced technologies. To recognize an event, people and objects must be tracked. Tracking also enhances the performance of tasks such as crowd analysis or human identification. Before an object can be tracked, it must be detected. Motion segmentation techniques, widely employed in tracking systems, produce a binary image in which objects can be located. However, these techniques are prone to errors caused by shadows and lighting changes. Detection routines often fail, either due to erroneous motion caused by noise and lighting effects, or due to the detection routines being unable to split occluded regions into their component objects. Particle filters can be used as a self-contained tracking system, and make it unnecessary for the task of detection to be carried out separately, except for an initial (often manual) detection to initialise the filter. Particle filters use one or more extracted features to evaluate the likelihood of an object existing at a given point in each frame. Such systems, however, do not easily allow for multiple objects to be tracked robustly, and do not explicitly maintain the identity of tracked objects. This dissertation investigates improvements to the performance of object tracking algorithms through improved motion segmentation and the use of a particle filter.
A novel hybrid motion segmentation / optical flow algorithm, capable of simultaneously extracting multiple layers of foreground and optical flow in surveillance video frames, is proposed. The algorithm is shown to perform well in the presence of adverse lighting conditions, and the optical flow is capable of extracting a moving object. The proposed algorithm is integrated within a tracking system and evaluated using the ETISEO (Evaluation du Traitement et de l'Interpretation de Sequences vidEO - Evaluation for video understanding) database, and significant improvement in detection and tracking performance is demonstrated when compared to a baseline system. A Scalable Condensation Filter (SCF), a particle filter designed to work within an existing tracking system, is also developed. The creation and deletion of modes and the maintenance of identity are handled by the underlying tracking system, and the tracking system is able to benefit from the improved performance in uncertain conditions arising from occlusion and noise provided by a particle filter. The system is evaluated using the ETISEO database. The dissertation then investigates fusion schemes for multi-spectral tracking systems. Four fusion schemes for combining a thermal and visual colour modality are evaluated using the OTCBVS (Object Tracking and Classification in and Beyond the Visible Spectrum) database. It is shown that a middle fusion scheme yields the best results and demonstrates a significant improvement in performance when compared to a system using either mode individually. Findings from the thesis contribute to improving the performance of semi-automated video processing and therefore improve security in areas under surveillance.
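The general particle-filter idea referred to in this abstract (evaluate the likelihood of an object at candidate positions in each frame, then resample) can be sketched as a generic bootstrap filter in one dimension. This is not the Scalable Condensation Filter developed in the dissertation: the random-walk motion model, the Gaussian likelihood, the noise levels and the simulated measurements are all assumptions made for the sake of a small runnable example.

```python
import math
import random


def particle_filter_step(particles, measurement, motion_std=1.0, meas_std=2.0, rng=random):
    """One predict / weight / resample cycle of a bootstrap particle filter in 1-D."""
    # Predict: diffuse each particle with an assumed random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: assumed Gaussian likelihood of the measurement given each particle.
    weights = [math.exp(-0.5 * ((measurement - p) / meas_std) ** 2) for p in predicted]
    if sum(weights) == 0.0:
        # Every particle is far from the measurement; fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw particles in proportion to their weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


rng = random.Random(0)  # fixed seed so the run is repeatable
particles = [rng.uniform(0.0, 100.0) for _ in range(500)]
true_position = 40.0
for _ in range(20):
    true_position += 1.0  # simulated object moves one pixel per frame
    measurement = true_position + rng.gauss(0.0, 2.0)  # simulated noisy detection
    particles = particle_filter_step(particles, measurement, rng=rng)

# Point estimate: mean of the resampled particle set.
estimate = sum(particles) / len(particles)
```

A filter like this tracks a single mode; as the abstract notes, creating and deleting modes and maintaining object identity need additional machinery, which the dissertation delegates to the underlying tracking system.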

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Identifying an individual from surveillance video is a difficult, time-consuming and labour-intensive process. The proposed system aims to streamline this process by filtering out unwanted scenes and enhancing an individual's face through super-resolution. An automatic face recognition system is then used to identify the subject or present the human operator with likely matches from a database. A person tracker is used to speed up the subject detection and super-resolution process by tracking moving subjects and cropping a region of interest around the subject's face, which reduces the number and the size, respectively, of the image frames to be super-resolved. In this paper, experiments have been conducted to demonstrate how the optical flow super-resolution method used improves surveillance imagery for visual inspection as well as automatic face recognition on an Eigenface and Elastic Bunch Graph Matching system. The optical flow based method has also been benchmarked against the "hallucination" algorithm, interpolation methods and the original low-resolution images. Results show that both super-resolution algorithms improved recognition rates significantly. Although the hallucination method resulted in slightly higher recognition rates, the optical flow method produced fewer artifacts and more visually correct images suitable for human consumption.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Characteristics of surveillance video generally include low resolution and poor quality due to environmental, storage and processing limitations. It is extremely difficult for computers and human operators to identify individuals from these videos. To overcome this problem, super-resolution can be used in conjunction with an automated face recognition system to enhance the spatial resolution of video frames containing the subject and narrow down the number of manual verifications performed by the human operator by presenting a list of most likely candidates from the database. As the super-resolution reconstruction process is ill-posed, visual artifacts are often generated as a result. These artifacts can be visually distracting to humans and/or affect machine recognition algorithms. While it is intuitive that higher resolution should lead to improved recognition accuracy, the effects of super-resolution and such artifacts on face recognition performance have not been systematically studied. This paper aims to address this gap while illustrating that super-resolution allows more accurate identification of individuals from low-resolution surveillance footage. The proposed optical flow-based super-resolution method is benchmarked against Baker et al.’s hallucination and Schultz et al.’s super-resolution techniques on images from the Terrascope and XM2VTS databases. Ground truth and interpolated images were also tested to provide a baseline for comparison. Results show that a suitable super-resolution system can improve the discriminability of surveillance video and enhance face recognition accuracy. The experiments also show that Schultz et al.’s method fails when dealing with surveillance footage due to its assumption of rigid objects in the scene. The hallucination and optical flow-based methods performed comparably, with the optical flow-based method producing fewer visually distracting artifacts that interfered with human recognition.