11 results for audio segmentation
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
This paper presents an optimum user-steered boundary tracking approach for image segmentation, which simulates the behavior of water flowing through a riverbed. The riverbed approach was devised using the image foresting transform with a never-exploited connectivity function. We analyze its properties in the derived image graphs and discuss its theoretical relation with other popular methods such as live wire and graph cuts. Several experiments show that riverbed can significantly reduce the number of user interactions (anchor points), as compared to live wire for objects with complex shapes. This paper also includes a discussion about how to combine different methods in order to take advantage of their complementary strengths.
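As a rough illustration of the image foresting transform (IFT) machinery the riverbed method builds on, the Python sketch below runs the standard Dijkstra-like optimum-path propagation on an image graph with a pluggable path-cost (connectivity) function. The riverbed connectivity function itself is defined in the paper; the f_max rule shown here is only a common placeholder, and the graph/seed representation is our own assumption.

import heapq

def ift(graph, seeds, path_cost):
    # graph: {node: [(neighbor, arc_weight), ...]}; seed nodes start with zero cost.
    # path_cost(cost_of_path_so_far, arc_weight) gives the cost of the extended path.
    cost = {v: float("inf") for v in graph}
    pred = {v: None for v in graph}
    for s in seeds:
        cost[s] = 0
    heap = [(0, s) for s in seeds]
    heapq.heapify(heap)
    while heap:
        c, v = heapq.heappop(heap)
        if c > cost[v]:
            continue  # stale queue entry
        for u, w in graph[v]:
            extended = path_cost(c, w)
            if extended < cost[u]:
                cost[u] = extended
                pred[u] = v  # optimum-path forest: trace pred back to recover a boundary segment
                heapq.heappush(heap, (extended, u))
    return cost, pred

# Placeholder connectivity function (maximum arc weight along the path);
# the riverbed connectivity function from the paper would be plugged in instead.
f_max = lambda path_cost_so_far, arc_weight: max(path_cost_so_far, arc_weight)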
Abstract:
A deep theoretical analysis of the graph cut image segmentation framework presented in this paper simultaneously translates into important contributions in several directions. The most important practical contribution of this work is a full theoretical description, and implementation, of a novel powerful segmentation algorithm, GC_max. The output of GC_max coincides with a version of a segmentation algorithm known as Iterative Relative Fuzzy Connectedness, IRFC. However, GC_max is considerably faster than the classic IRFC algorithm, which we prove theoretically and show experimentally. Specifically, we prove that, in the worst case scenario, the GC_max algorithm runs in linear time with respect to the variable M = |C| + |Z|, where |C| is the image scene size and |Z| is the size of the allowable range, Z, of the associated weight/affinity function. For most implementations, Z is identical to the set of allowable image intensity values, and its size can be treated as small with respect to |C|, meaning that O(M) = O(|C|). In such a situation, GC_max runs in linear time with respect to the image size |C|. We show that the output of GC_max constitutes a solution of a graph cut energy minimization problem, in which the energy is defined as the ℓ∞ norm ‖F_P‖_∞ of the map F_P that associates, with every element e from the boundary of an object P, its weight w(e). This formulation brings IRFC algorithms into the realm of the graph cut energy minimizers, with energy functions ‖F_P‖_q for q ∈ [1, ∞]. Of these, the best known minimization problem is for the energy ‖F_P‖_1, which is solved by the classic min-cut/max-flow algorithm, often referred to as the Graph Cut algorithm. We notice that a minimization problem for ‖F_P‖_q, q ∈ [1, ∞), is identical to that for ‖F_P‖_1 when the original weight function w is replaced by w^q. Thus, any algorithm GC_sum solving the ‖F_P‖_1 minimization problem also solves the one for ‖F_P‖_q with q ∈ [1, ∞), so just two algorithms, GC_sum and GC_max, are enough to solve all ‖F_P‖_q minimization problems. We also show that, for any fixed weight assignment, the solutions of the ‖F_P‖_q minimization problems converge to a solution of the ‖F_P‖_∞ minimization problem (the fact that ‖F_P‖_∞ = lim_{q→∞} ‖F_P‖_q is not enough to deduce that). An experimental comparison of the performance of the GC_max and GC_sum algorithms is included. It concentrates on comparing the actual (as opposed to provable worst-case) running times of the algorithms, as well as the influence of the choice of the seeds on the output.
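For readability, the energies discussed above can be written out explicitly; this is only a transcription of the abstract's notation into standard form, with bd(P) denoting the boundary of the object P:

\[
\|F_P\|_q \;=\; \Big( \sum_{e \in \mathrm{bd}(P)} w(e)^q \Big)^{1/q}, \quad q \in [1,\infty),
\qquad
\|F_P\|_\infty \;=\; \max_{e \in \mathrm{bd}(P)} w(e) \;=\; \lim_{q \to \infty} \|F_P\|_q .
\]

Minimizing \(\|F_P\|_q\) with weights \(w\) is the same problem as minimizing \(\|F_P\|_1\) with weights \(w^q\), which is why the pair GC_sum and GC_max covers every value of \(q\).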
Abstract:
Bilayer segmentation of live video in uncontrolled environments is an essential task for home applications in which the original background of the scene must be replaced, as in video chats or traditional videoconferencing. The main challenge in such conditions is to overcome the difficulties posed by problem situations (e.g., illumination changes, distracting events such as elements moving in the background, and camera shake) that may occur while the video is being captured. This paper presents a survey of segmentation methods for background substitution applications, describes the main concepts, and identifies events that may cause errors. Our analysis shows that the most robust methods rely on specific devices (multiple cameras or sensors that generate depth maps) which aid the process. In order to achieve the same results using conventional devices (monocular video cameras), most current research relies on energy minimization frameworks, in which temporal and spatial information are probabilistically combined with color and contrast information.
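As a purely illustrative sketch (not a formula from the paper or from any specific surveyed method), energy minimization frameworks of the kind mentioned above typically choose a foreground/background labeling \(\alpha\) over pixels by minimizing

\[
E(\alpha) \;=\; \sum_{p} \Big( U_p^{\text{color}}(\alpha_p) + U_p^{\text{temporal}}(\alpha_p) \Big)
\;+\; \lambda \sum_{(p,q)\in\mathcal{N}} V_{pq}\,[\alpha_p \neq \alpha_q],
\]

where the unary terms score how well a label fits the color and temporal evidence at pixel \(p\), and the contrast-dependent pairwise term \(V_{pq}\) penalizes label changes across low-contrast neighboring pixels.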
Abstract:
Background: Atherosclerosis causes millions of deaths annually, yielding billions in expenses around the world. Intravascular Optical Coherence Tomography (IVOCT) is a medical imaging modality that displays high-resolution images of coronary cross-sections. Nonetheless, quantitative information can only be obtained with segmentation; consequently, more adequate diagnostics, therapies and interventions can be provided. Since it is a relatively new modality, many segmentation methods available in the literature for other modalities could be successfully applied to IVOCT images, improving accuracy and usefulness. Method: An automatic lumen segmentation approach, based on the Wavelet Transform and Mathematical Morphology, is presented. The methodology is divided into three main parts. First, the preprocessing stage attenuates undesirable information and enhances important information. Second, in the feature extraction block, the wavelet transform is combined with an adapted version of the Otsu threshold; hence, tissue information is discriminated and binarized. Finally, binary morphological reconstruction improves the binary information and constructs the binary lumen object. Results: The evaluation was carried out by segmenting 290 challenging images from human and pig coronaries and rabbit iliac arteries; the outcomes were compared with gold standards made by experts. The resulting accuracy was: True Positive (%) = 99.29 ± 2.96, False Positive (%) = 3.69 ± 2.88, False Negative (%) = 0.71 ± 2.96, Max False Positive Distance (mm) = 0.1 ± 0.07, Max False Negative Distance (mm) = 0.06 ± 0.1. Conclusions: By segmenting a number of IVOCT images with various features, the proposed technique proved to be robust and more accurate than previously published methods; in addition, the method is completely automatic, providing a new tool for IVOCT segmentation.
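A minimal Python sketch of the three stages named above (preprocessing, wavelet features with Otsu thresholding, binary morphological reconstruction) is given below, assuming PyWavelets and scikit-image; the specific filters, wavelet sub-bands, adapted Otsu threshold and reconstruction markers used in the paper are not reproduced here, so every concrete choice in the sketch is an assumption.

import pywt
from skimage import filters, morphology

def lumen_mask_sketch(image):
    # 1) Preprocessing: attenuate speckle-like noise (median filter is an assumed choice).
    smoothed = filters.median(image)
    # 2) Feature extraction: take the wavelet approximation band (half resolution)
    #    and binarize it with a plain Otsu threshold (the paper adapts Otsu).
    approx, _details = pywt.dwt2(smoothed, "haar")
    binary = approx > filters.threshold_otsu(approx)
    # 3) Binary morphological reconstruction: an eroded marker is grown back inside
    #    the binary mask, keeping the large connected region assumed to be the lumen.
    marker = morphology.binary_erosion(binary, morphology.disk(5))
    lumen = morphology.reconstruction(marker, binary, method="dilation")
    return lumen.astype(bool)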
Abstract:
The cochlear implant (CI) has been indicated for children with severe and/or profound hearing loss who do not benefit from hearing aids (individual sound amplification devices, AASI), and whose families are suitable and motivated to use the device, with adequate rehabilitation conditions available in their city of origin. Currently, the CI is also sought by deaf parents, fluent in Brazilian Sign Language (LIBRAS), who turn to this treatment to offer their children a different reality. These children's environment is bilingual, consisting of the parents' LIBRAS and the oral language of close relatives, the speech-language pathologist, and the school. In this context, the present study aimed to follow four hearing-impaired implanted children: two children of hearing-impaired parents fluent in LIBRAS (exposed to a bilingual environment) and two children of parents without hearing alterations (exposed to an oral environment). To this end, hearing abilities and oral language acquisition were compared across the four implanted children. It was observed that the four children presented similar auditory and language abilities throughout the first year of CI use. From that point on, however, the children in the bilingual environment showed better auditory and linguistic performance compared to the development of the other children. Children in a bilingual environment can benefit from the CI, developing auditory and language abilities similar to those of children in an oral environment. It should be emphasized that the benefits of the device depend on multifactorial aspects, and more in-depth studies are needed.
Abstract:
OBJECTIVE: To propose an automatic brain tumor segmentation system. METHODS: The system used texture characteristics as its main source of information for segmentation. RESULTS: The mean correct match between the segmented areas and the ground truth was 94%. CONCLUSION: The final results showed that the proposed system was able to find and delimit tumor areas without requiring any user interaction.
Abstract:
The parenchymal distribution of the splenic artery was studied in order to obtain an anatomical basis for partial splenectomy. Thirty-two spleens were studied: 26 spleens of healthy horses weighing 320 to 450 kg, aged 3 to 12 years, and 6 spleens from fetuses obtained from a slaughterhouse. The spleens were submitted to arteriography and scintigraphy so that their vascular pattern could be examined and compared with the external aspect of the organ, aiming to establish anatomo-surgical segments. All radiographs were photographed with a digital camera, and the digital images were submitted to a measuring system for comparative analysis of the areas of the dorsal and ventral anatomo-surgical segments. Anatomical investigation of the angioarchitecture of the equine spleen showed a paucivascular area, which coincides with a thinner external area, allowing the organ to be divided into two anatomo-surgical segments of approximately 50% of the organ each.
Abstract:
Introduction: Ablepharon-Macrostomia Syndrome (AMS) is a rare condition comprising absent or short eyelids, abnormal ears, macrostomia, anomalous genitalia, redundant skin, and absent hair. Brancati et al. (2004) reported an estimated occurrence of hearing loss in 70% of this population. Specific studies on hearing in AMS were not present in the papers compiled. Case reports: Patient 1 is the one-year-old first child of a 23-year-old mother and a 25-year-old father, who are not consanguineous. Her clinical features are sparse scalp hair, cup-shaped ears, a broad nasal root, anteverted nostrils, macrostomia, webbed fingers, redundant skin, and hypoplastic nipples and lips. She has no neuropsychomotor developmental delay and her speech is normal. Her hearing was assessed at 15 years of age, revealing a hearing loss at 6 kHz. Patient 2 is the third child of the same couple. She has severely deficient eyelids, a low nasal bridge with hypoplastic and anteverted nostrils, macrostomia, abnormally modeled ears, absent nipples, a 6 cm omphalocele, an anteriorly placed anus, hypoplasia of the labia majora, hypoplastic nails, distal tapering of the phalanges, and redundant skin. She is developing with neuropsychomotor delay, normal speech, and mild bilateral conductive hearing loss. The audiological evaluation included four procedures: audiological clinical history, otological inspection, immittance testing, and pure-tone and speech audiometry. Conclusions: The AMS patients studied presented mild hearing loss, and this hearing loss can be considered part of the AMS phenotype, the findings being compatible with the literature.
Abstract:
Recently there has been considerable interest in dynamic textures due to the explosive growth of multimedia databases. In addition, dynamic texture appears in a wide range of videos, which makes it very important in applications concerned with modeling physical phenomena. Thus, dynamic textures have emerged as a new field of investigation that extends static (spatial) textures to the spatio-temporal domain. In this paper, we propose a novel approach for dynamic texture segmentation based on automata theory and the k-means algorithm. In this approach, a feature vector is extracted for each pixel by applying deterministic partially self-avoiding walks on three orthogonal planes of the video. Then, these feature vectors are clustered by the well-known k-means algorithm. Although the k-means algorithm has shown interesting results, it only guarantees convergence to a local minimum, which affects the final segmentation result. In order to overcome this drawback, we compare six methods of initialization of k-means. The experimental results demonstrate the effectiveness of the proposed approach compared to state-of-the-art segmentation methods.
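As a sketch of the final clustering stage only (assuming scikit-learn; the deterministic partially self-avoiding walk features are specific to the paper and enter here as a precomputed array), per-pixel feature vectors can be grouped as follows, with the `init` argument standing in for the initialization strategies the abstract compares:

from sklearn.cluster import KMeans

def segment_feature_map(features, n_segments=2, init="k-means++"):
    # features: (height, width, dim) array of per-pixel descriptors, e.g. the
    # walk-based vectors described in the abstract (treated as given here).
    h, w, dim = features.shape
    flat = features.reshape(-1, dim)
    # init controls the initialization being compared; scikit-learn ships
    # "k-means++" and "random", or accepts an explicit array of starting centroids.
    km = KMeans(n_clusters=n_segments, init=init, n_init=10, random_state=0)
    labels = km.fit_predict(flat)
    return labels.reshape(h, w)  # segmentation label map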
Abstract:
Dynamic texture is a recent field of investigation that has received growing attention from the computer vision community in recent years. These patterns are moving textures in which the concept of self-similarity for static textures is extended to the spatio-temporal domain. In this paper, we propose a novel approach for dynamic texture representation that can be used for both texture analysis and segmentation. In this method, deterministic partially self-avoiding walks are performed on three orthogonal planes of the video in order to combine appearance and motion features. We validate our method on three applications of dynamic texture that present interesting challenges: recognition, clustering, and segmentation. Experimental results on these applications indicate that the proposed method improves dynamic texture representation compared to the state of the art.