847 resultados para Binary image
Resumo:
La segmentación de imágenes es un campo importante de la visión computacional y una de las áreas de investigación más activas, con aplicaciones en comprensión de imágenes, detección de objetos, reconocimiento facial, vigilancia de vídeo o procesamiento de imagen médica. La segmentación de imágenes es un problema difícil en general, pero especialmente en entornos científicos y biomédicos, donde las técnicas de adquisición imagen proporcionan imágenes ruidosas. Además, en muchos de estos casos se necesita una precisión casi perfecta. En esta tesis, revisamos y comparamos primero algunas de las técnicas ampliamente usadas para la segmentación de imágenes médicas. Estas técnicas usan clasificadores a nivel de pixel e introducen regularización sobre pares de píxeles que es normalmente insuficiente. Estudiamos las dificultades que presentan para capturar la información de alto nivel sobre los objetos a segmentar. Esta deficiencia da lugar a detecciones erróneas, bordes irregulares, configuraciones con topología errónea y formas inválidas. Para solucionar estos problemas, proponemos un nuevo método de regularización de alto nivel que aprende información topológica y de forma a partir de los datos de entrenamiento de una forma no paramétrica usando potenciales de orden superior. Los potenciales de orden superior se están popularizando en visión por computador, pero la representación exacta de un potencial de orden superior definido sobre muchas variables es computacionalmente inviable. Usamos una representación compacta de los potenciales basada en un conjunto finito de patrones aprendidos de los datos de entrenamiento que, a su vez, depende de las observaciones. Gracias a esta representación, los potenciales de orden superior pueden ser convertidos a potenciales de orden 2 con algunas variables auxiliares añadidas. Experimentos con imágenes reales y sintéticas confirman que nuestro modelo soluciona los errores de aproximaciones más débiles. Incluso con una regularización de alto nivel, una precisión exacta es inalcanzable, y se requeire de edición manual de los resultados de la segmentación automática. La edición manual es tediosa y pesada, y cualquier herramienta de ayuda es muy apreciada. Estas herramientas necesitan ser precisas, pero también lo suficientemente rápidas para ser usadas de forma interactiva. Los contornos activos son una buena solución: son buenos para detecciones precisas de fronteras y, en lugar de buscar una solución global, proporcionan un ajuste fino a resultados que ya existían previamente. Sin embargo, requieren una representación implícita que les permita trabajar con cambios topológicos del contorno, y esto da lugar a ecuaciones en derivadas parciales (EDP) que son costosas de resolver computacionalmente y pueden presentar problemas de estabilidad numérica. Presentamos una aproximación morfológica a la evolución de contornos basada en un nuevo operador morfológico de curvatura que es válido para superficies de cualquier dimensión. Aproximamos la solución numérica de la EDP de la evolución de contorno mediante la aplicación sucesiva de un conjunto de operadores morfológicos aplicados sobre una función de conjuntos de nivel. Estos operadores son muy rápidos, no sufren de problemas de estabilidad numérica y no degradan la función de los conjuntos de nivel, de modo que no hay necesidad de reinicializarlo. Además, su implementación es mucho más sencilla que la de las EDP, ya que no requieren usar sofisticados algoritmos numéricos. Desde un punto de vista teórico, profundizamos en las conexiones entre operadores morfológicos y diferenciales, e introducimos nuevos resultados en este área. Validamos nuestra aproximación proporcionando una implementación morfológica de los contornos geodésicos activos, los contornos activos sin bordes, y los turbopíxeles. En los experimentos realizados, las implementaciones morfológicas convergen a soluciones equivalentes a aquéllas logradas mediante soluciones numéricas tradicionales, pero con ganancias significativas en simplicidad, velocidad y estabilidad. ABSTRACT Image segmentation is an important field in computer vision and one of its most active research areas, with applications in image understanding, object detection, face recognition, video surveillance or medical image processing. Image segmentation is a challenging problem in general, but especially in the biological and medical image fields, where the imaging techniques usually produce cluttered and noisy images and near-perfect accuracy is required in many cases. In this thesis we first review and compare some standard techniques widely used for medical image segmentation. These techniques use pixel-wise classifiers and introduce weak pairwise regularization which is insufficient in many cases. We study their difficulties to capture high-level structural information about the objects to segment. This deficiency leads to many erroneous detections, ragged boundaries, incorrect topological configurations and wrong shapes. To deal with these problems, we propose a new regularization method that learns shape and topological information from training data in a nonparametric way using high-order potentials. High-order potentials are becoming increasingly popular in computer vision. However, the exact representation of a general higher order potential defined over many variables is computationally infeasible. We use a compact representation of the potentials based on a finite set of patterns learned fromtraining data that, in turn, depends on the observations. Thanks to this representation, high-order potentials can be converted into pairwise potentials with some added auxiliary variables and minimized with tree-reweighted message passing (TRW) and belief propagation (BP) techniques. Both synthetic and real experiments confirm that our model fixes the errors of weaker approaches. Even with high-level regularization, perfect accuracy is still unattainable, and human editing of the segmentation results is necessary. The manual edition is tedious and cumbersome, and tools that assist the user are greatly appreciated. These tools need to be precise, but also fast enough to be used in real-time. Active contours are a good solution: they are good for precise boundary detection and, instead of finding a global solution, they provide a fine tuning to previously existing results. However, they require an implicit representation to deal with topological changes of the contour, and this leads to PDEs that are computationally costly to solve and may present numerical stability issues. We present a morphological approach to contour evolution based on a new curvature morphological operator valid for surfaces of any dimension. We approximate the numerical solution of the contour evolution PDE by the successive application of a set of morphological operators defined on a binary level-set. These operators are very fast, do not suffer numerical stability issues, and do not degrade the level set function, so there is no need to reinitialize it. Moreover, their implementation is much easier than their PDE counterpart, since they do not require the use of sophisticated numerical algorithms. From a theoretical point of view, we delve into the connections between differential andmorphological operators, and introduce novel results in this area. We validate the approach providing amorphological implementation of the geodesic active contours, the active contours without borders, and turbopixels. In the experiments conducted, the morphological implementations converge to solutions equivalent to those achieved by traditional numerical solutions, but with significant gains in simplicity, speed, and stability.
Resumo:
The main problem to study vertical drainage from the moisture distribution, on a vertisol profile, is searching for suitable methods using these procedures. Our aim was to design a digital image processing methodology and its analysis to characterize the moisture content distribution of a vertisol profile. In this research, twelve soil pits were excavated on a ba re Mazic Pellic Vertisols ix of them in May 13/2011 and the rest in May 19 /2011 after a moderate rainfall event. Digital RGB images were taken from each vertisol pit using a Kodak? camera selecting a size of 1600x945 pixels. Each soil image was processed to homogenized brightness and then a spatial filter with several window sizes was applied to select the optimum one. The RGB image obtained were divided in each matrix color selecting the best thresholds for each one, maximum and minimum, to be applied and get a digital binary pattern. This one was analyzed by estimating two fractal scaling exponents box counting dimension D BC) and interface fractal dimension (D) In addition, three pre-fractal scaling coefficients were determinate at maximum resolution: total number of boxes intercepting the foreground pattern (A), fractal lacunarity (?1) and Shannon entropy S1). For all the images processed the spatial filter 9x9 was the optimum based on entropy, cluster and histogram criteria. Thresholds for each color were selected based on bimodal histograms.
Resumo:
A new technology is being proposed as a solution to the problem of unintentional facial detection and recognition in pictures in which the individuals appearing want to express their privacy preferences, through the use of different tags. The existing methods for face de-identification were mostly ad hoc solutions that only provided an absolute binary solution in a privacy context such as pixelation, or a bar mask. As the number and users of social networks are increasing, our preferences regarding our privacy may become more complex, leaving these absolute binary solutions as something obsolete. The proposed technology overcomes this problem by embedding information in a tag which will be placed close to the face without being disruptive. Through a decoding method the tag will provide the preferences that will be applied to the images in further stages.
Resumo:
A more natural, intuitive, user-friendly, and less intrusive Human–Computer interface for controlling an application by executing hand gestures is presented. For this purpose, a robust vision-based hand-gesture recognition system has been developed, and a new database has been created to test it. The system is divided into three stages: detection, tracking, and recognition. The detection stage searches in every frame of a video sequence potential hand poses using a binary Support Vector Machine classifier and Local Binary Patterns as feature vectors. These detections are employed as input of a tracker to generate a spatio-temporal trajectory of hand poses. Finally, the recognition stage segments a spatio-temporal volume of data using the obtained trajectories, and compute a video descriptor called Volumetric Spatiograms of Local Binary Patterns (VS-LBP), which is delivered to a bank of SVM classifiers to perform the gesture recognition. The VS-LBP is a novel video descriptor that constitutes one of the most important contributions of the paper, which is able to provide much richer spatio-temporal information than other existing approaches in the state of the art with a manageable computational cost. Excellent results have been obtained outperforming other approaches of the state of the art.
Resumo:
This work has been partially supported by Grant No. DO 02-275, 16.12.2008, Bulgarian NSF, Ministry of Education and Science.
Resumo:
Communication has become an essential function in our civilization. With the increasing demand for communication channels, it is now necessary to find ways to optimize the use of their bandwidth. One way to achieve this is by transforming the information before it is transmitted. This transformation can be performed by several techniques. One of the newest of these techniques is the use of wavelets. Wavelet transformation refers to the act of breaking down a signal into components called details and trends by using small waveforms that have a zero average in the time domain. After this transformation the data can be compressed by discarding the details, transmitting the trends. In the receiving end, the trends are used to reconstruct the image. In this work, the wavelet used for the transformation of an image will be selected from a library of available bases. The accuracy of the reconstruction, after the details are discarded, is dependent on the wavelets chosen from the wavelet basis library. The system developed in this thesis takes a 2-D image and decomposes it using a wavelet bank. A digital signal processor is used to achieve near real-time performance in this transformation task. A contribution of this thesis project is the development of DSP-based test bed for the future development of new real-time wavelet transformation algorithms.
Resumo:
The basic objective of this work is to evaluate the durability of self-compacting concrete (SCC) produced in binary and ternary mixes using fly ash (FA) and limestone filler (LF) as partial replacement of cement. The main characteristics that set SCC apart from conventional concrete (fundamentally its fresh state behaviour) essentially depend on the greater or lesser content of various constituents, namely: greater mortar volume (more ultrafine material in the form of cement and mineral additions); proper control of the maximum size of the coarse aggregate; use of admixtures such as superplasticizers. Significant amounts of mineral additions are thus incorporated to partially replace cement, in order to improve the workability of the concrete. These mineral additions necessarily affect the concrete's microstructure and its durability. Therefore, notwithstanding the many well-documented and acknowledged advantages of SCC, a better understanding its behaviour is still required, in particular when its composition includes significant amounts of mineral additions. An ambitious working plan was devised: first, the SCC's microstructure was studied and characterized and afterwards the main transport and degradation mechanisms of the SCC produced were studied and characterized by means of SEM image analysis, chloride migration, electrical resistivity, and carbonation tests. It was then possible to draw conclusions about the SCC's durability. The properties studied are strongly affected by the type and content of the additions. Also, the use of ternary mixes proved to be extremely favourable, confirming the expected beneficial effect of the synergy between LF and FA.
Resumo:
We report on the construction of anatomically realistic three-dimensional in-silico breast phantoms with adjustable sizes, shapes and morphologic features. The concept of multiscale spatial resolution is implemented for generating breast tissue images from multiple modalities. Breast epidermal boundary and subcutaneous fat layer is generated by fitting an ellipsoid and 2nd degree polynomials to reconstructive surgical data and ultrasound imaging data. Intraglandular fat is simulated by randomly distributing and orienting adipose ellipsoids within a fibrous region immediately within the dermal layer. Cooper’s ligaments are simulated as fibrous ellipsoidal shells distributed within the subcutaneous fat layer. Individual ductal lobes are simulated following a random binary tree model which is generated based upon probabilistic branching conditions described by ramification matrices, as originally proposed by Bakic et al [3, 4]. The complete ductal structure of the breast is simulated from multiple lobes that extend from the base of the nipple and branch towards the chest wall. As lobe branching progresses, branches are reduced in height and radius and terminal branches are capped with spherical lobular clusters. Biophysical parameters are mapped onto the complete anatomical model and synthetic multimodal images (Mammography, Ultrasound, CT) are generated for phantoms of different adipose percentages (40%, 50%, 60%, and 70%) and are analytically compared with clinical examples. Results demonstrate that the in-silico breast phantom has applications in imaging performance evaluation and, specifically, great utility for solving image registration issues in multimodality imaging.
Resumo:
The goal of image retrieval and matching is to find and locate object instances in images from a large-scale image database. While visual features are abundant, how to combine them to improve performance by individual features remains a challenging task. In this work, we focus on leveraging multiple features for accurate and efficient image retrieval and matching. We first propose two graph-based approaches to rerank initially retrieved images for generic image retrieval. In the graph, vertices are images while edges are similarities between image pairs. Our first approach employs a mixture Markov model based on a random walk model on multiple graphs to fuse graphs. We introduce a probabilistic model to compute the importance of each feature for graph fusion under a naive Bayesian formulation, which requires statistics of similarities from a manually labeled dataset containing irrelevant images. To reduce human labeling, we further propose a fully unsupervised reranking algorithm based on a submodular objective function that can be efficiently optimized by greedy algorithm. By maximizing an information gain term over the graph, our submodular function favors a subset of database images that are similar to query images and resemble each other. The function also exploits the rank relationships of images from multiple ranked lists obtained by different features. We then study a more well-defined application, person re-identification, where the database contains labeled images of human bodies captured by multiple cameras. Re-identifications from multiple cameras are regarded as related tasks to exploit shared information. We apply a novel multi-task learning algorithm using both low level features and attributes. A low rank attribute embedding is joint learned within the multi-task learning formulation to embed original binary attributes to a continuous attribute space, where incorrect and incomplete attributes are rectified and recovered. To locate objects in images, we design an object detector based on object proposals and deep convolutional neural networks (CNN) in view of the emergence of deep networks. We improve a Fast RCNN framework and investigate two new strategies to detect objects accurately and efficiently: scale-dependent pooling (SDP) and cascaded rejection classifiers (CRC). The SDP improves detection accuracy by exploiting appropriate convolutional features depending on the scale of input object proposals. The CRC effectively utilizes convolutional features and greatly eliminates negative proposals in a cascaded manner, while maintaining a high recall for true objects. The two strategies together improve the detection accuracy and reduce the computational cost.
Resumo:
Image (Video) retrieval is an interesting problem of retrieving images (videos) similar to the query. Images (Videos) are represented in an input (feature) space and similar images (videos) are obtained by finding nearest neighbors in the input representation space. Numerous input representations both in real valued and binary space have been proposed for conducting faster retrieval. In this thesis, we present techniques that obtain improved input representations for retrieval in both supervised and unsupervised settings for images and videos. Supervised retrieval is a well known problem of retrieving same class images of the query. We address the practical aspects of achieving faster retrieval with binary codes as input representations for the supervised setting in the first part, where binary codes are used as addresses into hash tables. In practice, using binary codes as addresses does not guarantee fast retrieval, as similar images are not mapped to the same binary code (address). We address this problem by presenting an efficient supervised hashing (binary encoding) method that aims to explicitly map all the images of the same class ideally to a unique binary code. We refer to the binary codes of the images as `Semantic Binary Codes' and the unique code for all same class images as `Class Binary Code'. We also propose a new class based Hamming metric that dramatically reduces the retrieval times for larger databases, where only hamming distance is computed to the class binary codes. We also propose a Deep semantic binary code model, by replacing the output layer of a popular convolutional Neural Network (AlexNet) with the class binary codes and show that the hashing functions learned in this way outperforms the state of the art, and at the same time provide fast retrieval times. In the second part, we also address the problem of supervised retrieval by taking into account the relationship between classes. For a given query image, we want to retrieve images that preserve the relative order i.e. we want to retrieve all same class images first and then, the related classes images before different class images. We learn such relationship aware binary codes by minimizing the similarity between inner product of the binary codes and the similarity between the classes. We calculate the similarity between classes using output embedding vectors, which are vector representations of classes. Our method deviates from the other supervised binary encoding schemes as it is the first to use output embeddings for learning hashing functions. We also introduce new performance metrics that take into account the related class retrieval results and show significant gains over the state of the art. High Dimensional descriptors like Fisher Vectors or Vector of Locally Aggregated Descriptors have shown to improve the performance of many computer vision applications including retrieval. In the third part, we will discuss an unsupervised technique for compressing high dimensional vectors into high dimensional binary codes, to reduce storage complexity. In this approach, we deviate from adopting traditional hyperplane hashing functions and instead learn hyperspherical hashing functions. The proposed method overcomes the computational challenges of directly applying the spherical hashing algorithm that is intractable for compressing high dimensional vectors. A practical hierarchical model that utilizes divide and conquer techniques using the Random Select and Adjust (RSA) procedure to compress such high dimensional vectors is presented. We show that our proposed high dimensional binary codes outperform the binary codes obtained using traditional hyperplane methods for higher compression ratios. In the last part of the thesis, we propose a retrieval based solution to the Zero shot event classification problem - a setting where no training videos are available for the event. To do this, we learn a generic set of concept detectors and represent both videos and query events in the concept space. We then compute similarity between the query event and the video in the concept space and videos similar to the query event are classified as the videos belonging to the event. We show that we significantly boost the performance using concept features from other modalities.
Resumo:
Universidade Estadual de Campinas . Faculdade de Educação Física
Resumo:
Universidade Estadual de Campinas . Faculdade de Educação Física
Resumo:
PURPOSE: Juvenile idiopathic arthritis (JIA) has unknown etiology, and the involvement of the temporomandibular joint (TMJ) is rare in the early phase of the disease. The present article describes the use of computed tomography (CT) and magnetic resonance (MRI) images for the diagnosis of affected TMJ in JIA. CASE DESCRIPTION: A 12-year-old, female, Caucasian patient, with systemic rheumathoid arthritis and involvement of multiple joints was referred to the Imaging Center for TMJ assessment. The patient reported TMJ pain and limited opening of the mouth. The helical CT examination of the TMJ region showed asymmetric mandibular condyles, erosion of the right condyle and osteophyte-like formation. The MRI examination showed erosion of the right mandibular condyle, osteophytes, displacement without reduction and disruption of the articular disc. CONCLUSION: The disorders of the TMJ as a consequence of JIA must be carefully assessed by modern imaging methods such as CT and MRI. CT is very useful for the evaluation of discrete bone changes, which are not identified by conventional radiographs in the early phase of JIA. MRI allows the evaluation of soft tissues, the identification of acute articular inflammation and the differentiation between pannus and synovial hypertrophy.
Resumo:
This paper deals with the emission of gravitational radiation in the context of a previously studied metric nonsymmetric theory of gravitation. The part coming from the symmetric part of the metric coincides with the mass quadrupole moment result of general relativity. The one associated to the antisymmetric part of the metric involves the dipole moment of the fermionic charge of the system. The results are applied to binary star systems and the decrease of the period of the elliptical motion is calculated.
Resumo:
A phase shift proximity printing lithographic mask is designed, manufactured and tested. Its design is based on a Fresnel computer-generated hologram, employing the scalar diffraction theory. The obtained amplitude and phase distributions were mapped into discrete levels. In addition, a coding scheme using sub-cells structure was employed in order to increase the number of discrete levels, thus increasing the degree of freedom in the resulting mask. The mask is fabricated on a fused silica substrate and an amorphous hydrogenated carbon (a:C-H) thin film which act as amplitude modulation agent. The lithographic image is projected onto a resist coated silicon wafer, placed at a distance of 50 mu m behind the mask. The results show a improvement of the achieved resolution - linewidth as good as 1.5 mu m - what is impossible to obtain with traditional binary masks in proximity printing mode. Such achieved dimensions can be used in the fabrication of MEMS and MOEMS devices. These results are obtained with a UV laser but also with a small arc lamp light source exploring the partial coherence of this source. (C) 2010 Optical Society of America