845 resultados para Computer aided analysis, Machine vision, Video surveillance
Resumo:
Thesis (Master's)--University of Washington, 2016-06
Resumo:
In response to the increasing international competitiveness, many manufacturing businesses are rethinking their management strategies and philosophies towards achieving a computer integrated environment. The explosive growth in Advanced Manufacturing Technology (AMI) has resulted in the formation of functional "Islands of Automation" such as Computer Aided Design (CAD), Computer Aided Manufacturing (CAM), Computer Aided Process Planning (CAPP) and Manufacturing Resources Planning (MRPII). This has resulted in an environment which has focussed areas of excellence and poor overall efficiency, co-ordination and control. The main role of Computer Integrated Manufacturing (CIM) is to integrate these islands of automation and develop a totally integrated and controlled environment. However, the various perceptions of CIM, although developing, remain focussed on a very narrow integration scope and have consequently resulted in mere linked islands of automation with little improvement in overall co-ordination and control. This thesis, that is the research described within, develops and examines a more holistic view of CIM, which is based on the integration of various business elements. One particular business element, namely control, has been shown to have a multi-facetted and underpinning relationship with the CIM philosophy. This relationship impacts various CIM system design aspects including the CIM business analysis and modelling technique, the specification of systems integration requirements, the CIM system architectural form and the degree of business redesign. The research findings show that fundamental changes to CIM system design are required; these are incorporated in a generic CIM design methodology. The affect and influence of this holistic view of CIM on a manufacturing business has been evaluated through various industrial case study applications. Based on the evidence obtained, it has been concluded that this holistic, control based approach to CIM can provide a greatly improved means of achieving a totally integrated and controlled business environment. This generic CIM methodology will therefore make a significant contribution to the planning, modelling, design and development of future CIM systems.
Resumo:
This thesis presents a study of how edges are detected and encoded by the human visual system. The study begins with theoretical work on the development of a model of edge processing, and includes psychophysical experiments on humans, and computer simulations of these experiments, using the model. The first chapter reviews the literature on edge processing in biological and machine vision, and introduces the mathematical foundations of this area of research. The second chapter gives a formal presentation of a model of edge perception that detects edges and characterizes their blur, contrast and orientation, using Gaussian derivative templates. This model has previously been shown to accurately predict human performance in blur matching tasks with several different types of edge profile. The model provides veridical estimates of the blur and contrast of edges that have a Gaussian integral profile. Since blur and contrast are independent parameters of Gaussian edges, the model predicts that varying one parameter should not affect perception of the other. Psychophysical experiments showed that this prediction is incorrect: reducing the contrast makes an edge look sharper; increasing the blur reduces the perceived contrast. Both of these effects can be explained by introducing a smoothed threshold to one of the processing stages of the model. It is shown that, with this modification,the model can predict the perceived contrast and blur of a number of edge profiles that differ markedly from the ideal Gaussian edge profiles on which the templates are based. With only a few exceptions, the results from all the experiments on blur and contrast perception can be explained reasonably well using one set of parameters for each subject. In the few cases where the model fails, possible extensions to the model are discussed.
Resumo:
Purpose - To generate a reflectance model of the fundus that allows an accurate non-invasive quantification of blood and pigments. Methods - A Monte Carlo simulation was used to produce a mathematical model of light interaction with the fundus at different wavelengths. The model predictions were compared with fundus images from normal volunteers in several spectral bands (peaks at 507, 525, 552, 585, 596 and 611nm). Th e model was then used to calculate the concentration and distribution of the known absorbing components of the fundus. Results - The shape of the statistical distribution of the image data generally corresponded to that of the model data; the model however appears to overestimate the reflectance of the fundus in the longer wavelength region.As the absorption by xanthophyll has no significant eff ect on light transport above 534nm, its distribution in the fundus was quantified: the wavelengths where both shape and distribution of image and model data matched (<553nm) were used to train a neural network which was then applied to every point in the image data. The xanthophyll distribution thus found was in agreement with published literature data in normal subjects. Conclusion - We have developed a method for optimising multi-spectral imaging of the fundus and a computer image analysis capable of estimating information about the structure and properties of the fundus. Th e technique successfully calculates the distribution of xanthophyll in the fundus of healthy volunteers. Further improvement of the model is required to allow the deduction of other parameters from images; investigations in known pathology models are also necessary to establish if this method is of clinical use in detecting early chroido-retinopathies, hence providing a useful screening and diagnostic tool.
Resumo:
The main challenges of multimedia data retrieval lie in the effective mapping between low-level features and high-level concepts, and in the individual users' subjective perceptions of multimedia content. ^ The objectives of this dissertation are to develop an integrated multimedia indexing and retrieval framework with the aim to bridge the gap between semantic concepts and low-level features. To achieve this goal, a set of core techniques have been developed, including image segmentation, content-based image retrieval, object tracking, video indexing, and video event detection. These core techniques are integrated in a systematic way to enable the semantic search for images/videos, and can be tailored to solve the problems in other multimedia related domains. In image retrieval, two new methods of bridging the semantic gap are proposed: (1) for general content-based image retrieval, a stochastic mechanism is utilized to enable the long-term learning of high-level concepts from a set of training data, such as user access frequencies and access patterns of images. (2) In addition to whole-image retrieval, a novel multiple instance learning framework is proposed for object-based image retrieval, by which a user is allowed to more effectively search for images that contain multiple objects of interest. An enhanced image segmentation algorithm is developed to extract the object information from images. This segmentation algorithm is further used in video indexing and retrieval, by which a robust video shot/scene segmentation method is developed based on low-level visual feature comparison, object tracking, and audio analysis. Based on shot boundaries, a novel data mining framework is further proposed to detect events in soccer videos, while fully utilizing the multi-modality features and object information obtained through video shot/scene detection. ^ Another contribution of this dissertation is the potential of the above techniques to be tailored and applied to other multimedia applications. This is demonstrated by their utilization in traffic video surveillance applications. The enhanced image segmentation algorithm, coupled with an adaptive background learning algorithm, improves the performance of vehicle identification. A sophisticated object tracking algorithm is proposed to track individual vehicles, while the spatial and temporal relationships of vehicle objects are modeled by an abstract semantic model. ^
Resumo:
The rapid growth of the Internet and the advancements of the Web technologies have made it possible for users to have access to large amounts of on-line music data, including music acoustic signals, lyrics, style/mood labels, and user-assigned tags. The progress has made music listening more fun, but has raised an issue of how to organize this data, and more generally, how computer programs can assist users in their music experience. An important subject in computer-aided music listening is music retrieval, i.e., the issue of efficiently helping users in locating the music they are looking for. Traditionally, songs were organized in a hierarchical structure such as genre->artist->album->track, to facilitate the users’ navigation. However, the intentions of the users are often hard to be captured in such a simply organized structure. The users may want to listen to music of a particular mood, style or topic; and/or any songs similar to some given music samples. This motivated us to work on user-centric music retrieval system to improve users’ satisfaction with the system. The traditional music information retrieval research was mainly concerned with classification, clustering, identification, and similarity search of acoustic data of music by way of feature extraction algorithms and machine learning techniques. More recently the music information retrieval research has focused on utilizing other types of data, such as lyrics, user-access patterns, and user-defined tags, and on targeting non-genre categories for classification, such as mood labels and styles. This dissertation focused on investigating and developing effective data mining techniques for (1) organizing and annotating music data with styles, moods and user-assigned tags; (2) performing effective analysis of music data with features from diverse information sources; and (3) recommending music songs to the users utilizing both content features and user access patterns.
Resumo:
Engineering analysis in geometric models has been the main if not the only credible/reasonable tool used by engineers and scientists to resolve physical boundaries problems. New high speed computers have facilitated the accuracy and validation of the expected results. In practice, an engineering analysis is composed of two parts; the design of the model and the analysis of the geometry with the boundary conditions and constraints imposed on it. Numerical methods are used to resolve a large number of physical boundary problems independent of the model geometry. The time expended due to the computational process are related to the imposed boundary conditions and the well conformed geometry. Any geometric model that contains gaps or open lines is considered an imperfect geometry model and major commercial solver packages are incapable of handling such inputs. Others packages apply different kinds of methods to resolve this problems like patching or zippering; but the final resolved geometry may be different from the original geometry, and the changes may be unacceptable. The study proposed in this dissertation is based on a new technique to process models with geometrical imperfection without the necessity to repair or change the original geometry. An algorithm is presented that is able to analyze the imperfect geometric model with the imposed boundary conditions using a meshfree method and a distance field approximation to the boundaries. Experiments are proposed to analyze the convergence of the algorithm in imperfect models geometries and will be compared with the same models but with perfect geometries. Plotting results will be presented for further analysis and conclusions of the algorithm convergence
Resumo:
Lung cancer is one of the most common types of cancer and has the highest mortality rate. Patient survival is highly correlated with early detection. Computed Tomography technology services the early detection of lung cancer tremendously by offering aminimally invasive medical diagnostic tool. However, the large amount of data per examination makes the interpretation difficult. This leads to omission of nodules by human radiologist. This thesis presents a development of a computer-aided diagnosis system (CADe) tool for the detection of lung nodules in Computed Tomography study. The system, called LCD-OpenPACS (Lung Cancer Detection - OpenPACS) should be integrated into the OpenPACS system and have all the requirements for use in the workflow of health facilities belonging to the SUS (Brazilian health system). The LCD-OpenPACS made use of image processing techniques (Region Growing and Watershed), feature extraction (Histogram of Gradient Oriented), dimensionality reduction (Principal Component Analysis) and classifier (Support Vector Machine). System was tested on 220 cases, totaling 296 pulmonary nodules, with sensitivity of 94.4% and 7.04 false positives per case. The total time for processing was approximately 10 minutes per case. The system has detected pulmonary nodules (solitary, juxtavascular, ground-glass opacity and juxtapleural) between 3 mm and 30 mm.
Resumo:
Objective
Pedestrian detection under video surveillance systems has always been a hot topic in computer vision research. These systems are widely used in train stations, airports, large commercial plazas, and other public places. However, pedestrian detection remains difficult because of complex backgrounds. Given its development in recent years, the visual attention mechanism has attracted increasing attention in object detection and tracking research, and previous studies have achieved substantial progress and breakthroughs. We propose a novel pedestrian detection method based on the semantic features under the visual attention mechanism.
Method
The proposed semantic feature-based visual attention model is a spatial-temporal model that consists of two parts: the static visual attention model and the motion visual attention model. The static visual attention model in the spatial domain is constructed by combining bottom-up with top-down attention guidance. Based on the characteristics of pedestrians, the bottom-up visual attention model of Itti is improved by intensifying the orientation vectors of elementary visual features to make the visual saliency map suitable for pedestrian detection. In terms of pedestrian attributes, skin color is selected as a semantic feature for pedestrian detection. The regional and Gaussian models are adopted to construct the skin color model. Skin feature-based visual attention guidance is then proposed to complete the top-down process. The bottom-up and top-down visual attentions are linearly combined using the proper weights obtained from experiments to construct the static visual attention model in the spatial domain. The spatial-temporal visual attention model is then constructed via the motion features in the temporal domain. Based on the static visual attention model in the spatial domain, the frame difference method is combined with optical flowing to detect motion vectors. Filtering is applied to process the field of motion vectors. The saliency of motion vectors can be evaluated via motion entropy to make the selected motion feature more suitable for the spatial-temporal visual attention model.
Result
Standard datasets and practical videos are selected for the experiments. The experiments are performed on a MATLAB R2012a platform. The experimental results show that our spatial-temporal visual attention model demonstrates favorable robustness under various scenes, including indoor train station surveillance videos and outdoor scenes with swaying leaves. Our proposed model outperforms the visual attention model of Itti, the graph-based visual saliency model, the phase spectrum of quaternion Fourier transform model, and the motion channel model of Liu in terms of pedestrian detection. The proposed model achieves a 93% accuracy rate on the test video.
Conclusion
This paper proposes a novel pedestrian method based on the visual attention mechanism. A spatial-temporal visual attention model that uses low-level and semantic features is proposed to calculate the saliency map. Based on this model, the pedestrian targets can be detected through focus of attention shifts. The experimental results verify the effectiveness of the proposed attention model for detecting pedestrians.
Resumo:
Virtual topology operations have been utilized to generate an analysis topology definition suitable for downstream mesh generation. Detailed descriptions are provided for virtual topology merge and split operations for all topological entities. Current virtual topology technology is extended to allow the virtual partitioning of volume cells and the topological queries required to carry out each operation are provided. Virtual representations are robustly linked to the underlying geometric definition through an analysis topology. The analysis topology and all associated virtual and topological dependencies are automatically updated after each virtual operation, providing the link to the underlying CAD geometry. Therefore, a valid description of the analysis topology, including relative orientations, is maintained. This enables downstream operations, such as the merging or partitioning of virtual entities, and interrogations, such as determining if a specific meshing strategy can be applied to the virtual volume cells, to be performed on the analysis topology description. As the virtual representation is a non-manifold description of the sub-divided domain the interfaces between cells are recorded automatically. This enables the advantages of non-manifold modelling to be exploited within the manifold modelling environment of a major commercial CAD system, without any adaptation of the underlying CAD model. A hierarchical virtual structure is maintained where virtual entities are merged or partitioned. This has a major benefit over existing solutions as the virtual dependencies are stored in an open and accessible manner, providing the analyst with the freedom to create, modify and edit the analysis topology in any preferred sequence, whilst the original CAD geometry is not disturbed. Robust definitions of the topological and virtual dependencies enable the same virtual topology definitions to be accessed, interrogated and manipulated within multiple different CAD packages and linked to the underlying geometry.
Resumo:
This papers examines the use of trajectory distance measures and clustering techniques to define normal
and abnormal trajectories in the context of pedestrian tracking in public spaces. In order to detect abnormal
trajectories, what is meant by a normal trajectory in a given scene is firstly defined. Then every trajectory
that deviates from this normality is classified as abnormal. By combining Dynamic Time Warping and a
modified K-Means algorithms for arbitrary-length data series, we have developed an algorithm for trajectory
clustering and abnormality detection. The final system performs with an overall accuracy of 83% and 75%
when tested in two different standard datasets.
Resumo:
Part 8: Business Strategies Alignment
Resumo:
A method for estimating the dimensions of non-delimited free parking areas by using a static surveillance camera is proposed. The proposed method is specially designed to tackle the main challenges of urban scenarios (multiple moving objects, outdoor illumination conditions and occlusions between vehicles) with no training. The core of this work is the temporal analysis of the video frames to detect the occupancy variation of the parking areas. Two techniques are combined: background subtraction using a mixture of Gaussians to detect and track vehicles and the creation of a transience map to detect the parking and leaving of vehicles. The authors demonstrate that the proposed method yields satisfactory estimates on three real scenarios while being a low computational cost solution that can be applied in any kind of parking area covered by a single camera.