903 results for Audio-visual content classification
Abstract:
Humans are highly skilled at extracting information from visual data acquired by sight. Through a learning process that starts at birth and continues throughout life, image interpretation becomes almost instinctive. At a glance, one can describe a scene with reasonable precision, naming its main components. Usually, this is done by extracting low-level features such as edges, shapes and textures, and associating them with high-level meanings. In this way, a semantic description of the scene is produced. An example is the human capacity to recognize and describe other people's physical and behavioral characteristics, or biometrics. Soft biometrics also represent inherent characteristics of the human body and behaviour, but do not allow unique identification of a person. The field of computer vision aims to develop methods capable of performing visual interpretation with performance similar to that of humans. This thesis proposes computer vision methods that allow high-level information to be extracted from images in the form of soft biometrics. The problem is approached in two ways: unsupervised and supervised learning. The first approach seeks to group images via automatic learning of feature extraction, combining convolution techniques, evolutionary computing and clustering. The images used in this approach contain faces and people. The second approach employs convolutional neural networks, which can operate on raw images and learn both the feature extraction and classification processes. Here, images are classified according to gender and clothing, divided into the upper and lower parts of the human body. The first approach, when tested with different image datasets, obtained an accuracy of approximately 80% for faces versus non-faces and 70% for people versus non-people. The second, tested using images and videos, obtained an accuracy of about 70% for gender, 80% for upper clothes and 90% for lower clothes.
The results of these case studies show that the proposed methods are promising, allowing automatic annotation of images with high-level information. This opens possibilities for the development of applications in diverse areas such as content-based image and video search and automatic video surveillance, reducing human effort in manual annotation and monitoring.
Abstract:
This thesis proposes a generic visual perception architecture for robotic clothes perception and manipulation. The proposed architecture is fully integrated with a stereo vision system and a dual-arm robot and is able to perform a number of autonomous laundering tasks. Clothes perception and manipulation is a novel research topic in robotics and has experienced rapid development in recent years. Compared to the task of perceiving and manipulating rigid objects, clothes perception and manipulation poses a greater challenge. This can be attributed to two reasons: firstly, deformable clothing requires precise (high-acuity) visual perception and dexterous manipulation; secondly, as clothing approximates a non-rigid 2-manifold in 3-space that can adopt a quasi-infinite configuration space, the potential variability in the appearance of clothing items makes them difficult for a machine to understand, identify uniquely, and interact with. From an applications perspective, and as part of the EU CloPeMa project, the integrated visual perception architecture refines a pre-existing clothing manipulation pipeline by completing pre-wash clothes (category) sorting (using single-shot or interactive perception for garment categorisation and manipulation) and post-wash dual-arm flattening. To the best of the author’s knowledge, as investigated in this thesis, the autonomous clothing perception and manipulation solutions presented here were first proposed and reported by the author. All of the reported robot demonstrations in this work follow a perception-manipulation methodology where visual and tactile feedback (in the form of surface wrinkledness captured by the high-accuracy depth sensor, i.e. the CloPeMa stereo head, or the predictive confidence modelled by a Gaussian Process) serve as the halting criteria in the flattening and sorting tasks, respectively.
From a scientific perspective, the proposed visual perception architecture addresses the above challenges by parsing and grouping 3D clothing configurations hierarchically from low-level curvatures, through mid-level surface shape representations (providing topological descriptions and 3D texture representations), to high-level semantic structures and statistical descriptions. A range of visual features such as Shape Index, Surface Topologies Analysis and Local Binary Patterns have been adapted within this work to parse clothing surfaces and textures, and several novel features have been devised, including B-Spline Patches with Locality-Constrained Linear coding, and Topology Spatial Distance to describe and quantify generic landmarks (wrinkles and folds). The essence of the proposed architecture is generic 3D surface parsing and interpretation, which is critical to underpinning a number of laundering tasks and has the potential to be extended to other rigid and non-rigid object perception and manipulation tasks. The experimental results presented in this thesis demonstrate that: firstly, the proposed grasping approach achieves 84.7% accuracy on average; secondly, the proposed flattening approach is able to flatten towels, t-shirts and pants (shorts) within 9 iterations on average; thirdly, the proposed clothes recognition pipeline can recognise clothes categories from highly wrinkled configurations and advances the state of the art by 36% in terms of classification accuracy, achieving an 83.2% true-positive classification rate when discriminating between five categories of clothes; finally, the Gaussian Process based interactive perception approach exhibits a substantial improvement over single-shot perception. Accordingly, this thesis has advanced the state of the art of robot clothes perception and manipulation.
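The Shape Index named above is commonly defined from the two principal curvatures of a surface point. The following is a minimal sketch of that textbook definition, not the thesis's adapted implementation: the function name, the [-1, 1] output range and the handling of planar points are assumptions made for illustration, and which sign corresponds to convex or concave depends on the chosen surface-normal orientation.

```python
import math


def shape_index(k1, k2):
    """Shape index in [-1, 1] computed from two principal curvatures.

    Hypothetical helper for illustration. Returns None for a planar
    point (both curvatures zero), where the index is undefined.
    """
    k_max, k_min = max(k1, k2), min(k1, k2)
    if k_max == 0 and k_min == 0:
        return None
    # atan2 also covers the umbilic case k_max == k_min (sphere-like points),
    # where a plain division by (k_max - k_min) would fail.
    return (2.0 / math.pi) * math.atan2(k_max + k_min, k_max - k_min)
```

Under this convention a symmetric saddle maps to 0, a cylinder-like ridge to 0.5 and a sphere-like point to 1 or -1, so wrinkles and folds on a garment surface would fall near the ridge and rut values.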
Abstract:
Master's dissertation in Educational Psychology - Specialization in Basic Education, Faculdade de Ciências Humanas e Sociais e Escola Superior de Educação, Univ. do Algarve, 2003
Abstract:
117 p.
Abstract:
In this article we describe a semantic localization dataset for indoor environments named ViDRILO. The dataset provides five sequences of frames acquired with a mobile robot in two similar office buildings under different lighting conditions. Each frame consists of a point cloud representation of the scene and a perspective image. The frames in the dataset are annotated with the semantic category of the scene, but also with the presence or absence of a list of predefined objects appearing in the scene. In addition to the frames and annotations, the dataset is distributed with a set of tools for its use in both place classification and object recognition tasks. The large number of labeled frames in conjunction with the annotation scheme make this dataset different from existing ones. The ViDRILO dataset is released for use as a benchmark for different problems such as multimodal place classification and object recognition, 3D reconstruction or point cloud data compression.
Abstract:
This work presents two new, simple systems for human gait analysis using a depth camera (Microsoft Kinect) placed in front of a subject walking on a conventional treadmill, capable of distinguishing healthy gait from impaired gait. The first system relies on the fact that normal gait typically exhibits a smooth depth signal at each pixel, with fewer high frequencies, which makes it possible to estimate a map showing the location and amplitude of the high-frequency energy (HFSE). The second system analyses the body parts whose movement pattern is irregular, in terms of periodicity, during walking. We assume that the gait of a healthy subject exhibits, everywhere on the body and throughout the gait cycles, a depth signal with a periodic, noise-free pattern. From each subject's video sequence, we estimate a map showing the areas of gait irregularity (also called aperiodic noise energy). The HFSE map, or the map visualizing the aperiodic noise energy, can be used as a good indicator of a possible pathology in an early, fast and reliable diagnostic tool, or can provide information about the presence and extent of the patient's disease or (orthopedic, muscular or neurological) problems. Although the resulting maps are informative and highly discriminative for direct visual classification, even by a non-specialist, the proposed systems can also automatically detect healthy individuals and those with locomotor problems.
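The idea of a per-pixel high-frequency energy map can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the filter, so a centered moving average is used here as a stand-in low-pass, and the function names, the `window` parameter and the nested-list frame layout are all hypothetical.

```python
def high_freq_energy(signal, window=5):
    """Sum of squared residuals after removing a moving-average trend.

    `signal` is the depth value of one pixel over time. The moving
    average is an assumed stand-in for the (unspecified) low-pass filter.
    """
    half = window // 2
    energy = 0.0
    for t in range(len(signal)):
        lo, hi = max(0, t - half), min(len(signal), t + half + 1)
        smooth = sum(signal[lo:hi]) / (hi - lo)
        energy += (signal[t] - smooth) ** 2
    return energy


def hfse_map(frames, window=5):
    """Build an H x W energy map from a list of T depth frames (H x W lists)."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [high_freq_energy([f[y][x] for f in frames], window) for x in range(width)]
        for y in range(height)
    ]
```

A pixel whose depth stays smooth over the gait cycles yields a value near zero, while a pixel with jittery depth yields a large value, which is the contrast such a map is meant to make visible.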
Abstract:
Children with chronic conditions often experience long treatment, which can be complex and negatively impact the child's well-being. In planning treatment and interventions for children with chronic conditions, it is important to measure health-related quality of life (HrQoL). HrQoL instruments are considered to be patient-reported outcome measures (PROMs) and should be used in routine practice. Purpose: The aim of this study was to compare the content dimensions of HrQoL instruments for children's self-reports using the framework of the ICF-CY. Method: The sample consisted of six instruments for health-related quality of life for children 5 to 18 years of age, which were used in the Swedish national quality registries for children and adolescents with chronic conditions. The following instruments were included: CHQ-CF, DCGM-37, EQ-5D-Y, KIDSCREEN-52, Kid-KINDL and PedsQL 4.0. The framework of the ICF-CY was used as the basis for the comparison. Results: There were 290 meaningful concepts identified and linked to 88 categories in the ICF-CY classification, with 29 categories in the component body functions, 48 categories in the component activities and participation, and 11 categories in the component environmental factors. No concepts were linked to the component body structures. The comparison revealed that the items in the HrQoL instruments corresponded primarily with the domains of activities and less with environmental factors. Conclusions: The results confirm that the ICF-CY provides a good framework for content comparisons that evaluate similarities and differences in relation to ICF-CY categories. The results of this study revealed the need for greater consensus of content across different HrQoL instruments. To obtain a detailed description of children's HrQoL, DCGM-37 and KIDSCREEN-52 may be appropriate instruments to use, and they can increase the understanding of young patients' needs.
Abstract:
Background: Impairments in social communication are the hallmark feature of autism spectrum disorder (ASD). Operationalizing ‘severity’ in ASD has been challenging; thus stratifying by functioning has not been possible. Purpose: To describe the development of the Autism Classification System of Functioning: Social Communication (ACSF:SC) and evaluate its consistency within and between parent and professional ratings. Methodology: (1) ACSF:SC development based on focus groups and surveys involving parents, educators and clinicians familiar with preschoolers with ASD; and (2) evaluation of the intra- and inter-rater agreement of the ACSF:SC using weighted kappa (κw). Results: Seventy-six participants were involved in the development process. Core characteristics of social communication were ascertained: communicative intent; communicative skills and reciprocity; and impact of environment. Five ACSF:SC levels were created and content-validated across participants. Best capacity and typical performance agreement ratings varied as follows: intra-rater agreement on 41 children was κw=0.61-0.69 for parents and κw=0.71-0.95 for professionals; inter-rater agreement was κw=0.47-0.61 between professionals and κw=0.33-0.53 between parents and professionals. Conclusions: Perspectives from parents and professionals informed ACSF:SC development, providing common descriptions of the levels of everyday communicative abilities of children with ASD to complement DSM-5. Rater agreement demonstrates that the ACSF:SC can be utilized with acceptable consistency in comparison to other functional classification systems.
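For readers unfamiliar with weighted kappa, the statistic for two raters on an ordinal scale can be sketched as below. This is a generic illustration, not the study's analysis: the abstract does not say whether linear or quadratic weights were used, so the `weight` parameter is an assumption, the function name is hypothetical, and levels are coded 0 to `n_levels - 1` (0 to 4 for a five-level scale).

```python
def weighted_kappa(ratings_a, ratings_b, n_levels, weight="linear"):
    """Weighted kappa for two raters on an ordinal scale coded 0..n_levels-1."""
    n = len(ratings_a)
    # Observed joint proportions of (rater A level, rater B level).
    observed = [[0.0] * n_levels for _ in range(n_levels)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(n_levels)]
    col = [sum(observed[i][j] for i in range(n_levels)) for j in range(n_levels)]

    def disagreement(i, j):
        d = abs(i - j) / (n_levels - 1)
        return d if weight == "linear" else d * d  # otherwise quadratic

    levels = range(n_levels)
    obs = sum(disagreement(i, j) * observed[i][j] for i in levels for j in levels)
    exp = sum(disagreement(i, j) * row[i] * col[j] for i in levels for j in levels)
    if exp == 0:
        raise ValueError("kappa is undefined when both raters use a single level")
    return 1.0 - obs / exp
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance; near-misses on adjacent levels are penalized less than distant disagreements, which is why the weighted form suits ordinal levels.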
Abstract:
Soils formed in high mountainous regions in southern Brazil are characterized by great accumulation of organic matter (OM) in the surface horizons and variation in the degree of development. We hypothesized that soil properties and genesis are influenced by the interaction of parent materials and climate factors, which differ depending on the location along the altitudinal gradient. The goal of this study was to characterize and classify the soil, evaluate soil distribution, and determine the interactive effects of soil-forming factors in the subtropical mountain regions in Santa Catarina state. Soil samples were collected in areas known for wine production, for a total of 38 modal profiles. Based on morphological, physical, and chemical properties, soils were evaluated for pedogenesis and classified according to the Brazilian System of Soil Classification, with equivalent classes in the World Reference Basis (WRB). The results indicated that pedogenesis was strongly influenced by the parent material, weather, and relief. In the areas where basic effusive rocks (basalt) were observed, there was formation of extensive areas of clayey soils with reddish color and higher iron oxide contents. There was a predominance of Nitossolos Vermelhos and Háplicos (Nitisols), Latossolos Vermelhos (Ferralsols), and Cambissolos Háplicos (Cambisols), highlighting the pedogenetic processes of eluviation, illuviation of clay, and latosolization in conditions of year-long, large-volume, well-distributed rainfall and stability of land forms. In areas with acid effusive rocks (rhyodacites), medial or clayey soils were observed with lower iron oxide content, invariably acidic, and with low base content. For these soils, relief promoted substantial removal of material, resulting in intense rejuvenation, with a predominance of Cambissolos Háplicos (Cambisols) and lesser occurrence of Nitossolos Brunos (Nitisols) and Neossolos Litólicos (Leptosols). 
Soils formed from sedimentary rocks also tended to be more acidic, but with higher sand content, and the soils identified were Cambissolos Háplicos and Húmicos (Cambisols). Cluster analysis separated the soil profiles into three groups: the first and largest was formed by profiles originating from sedimentary rocks and rhyodacites; the second, smaller group was formed by four profiles in the Água Doce region (acidic rocks); and the third was formed by profiles derived from basalt. Discriminant analysis was effective in grouping soil classes. Thus, the study highlighted the importance of geology in the formation of soils in this landscape associated with climate and relief.
Abstract:
Throughout the years, technology has had an undeniable impact on the AVT field. It has revolutionized the way audiovisual content is consumed by allowing audiences to easily access it at any time and on any device. Especially after the introduction of OTT streaming platforms such as Netflix, Amazon Prime Video, Disney+, Apple TV+, and HBO Max, which offer a vast catalog of national and international products, the consumption of audiovisual products has been on a constant rise and, consequently, so has the demand for localized content. In turn, the AVT industry resorts to new technologies and practices to handle the ever-growing workload and the faster turnaround times. Because of its numerous implications for the industry, technological advancement can be considered an area of research of particular interest for AVT studies. However, in the case of dubbing, research and discussion on the topic are lagging behind because of the more limited impact that technology has had on the very conservative dubbing industry. Therefore, the aim of the dissertation is to offer an overview of some of the latest technological innovations and practices that have already been implemented (i.e. cloud dubbing and DeepDub technology) or that are still under development and research (i.e. automatic speech recognition and respeaking, machine translation and post-editing, audio-based and visual-based dubbing techniques, text-based editing of talking-head videos, and automatic dubbing), to discuss their reception by industry professionals, and to make assumptions about their future implementation in the dubbing field.
Abstract:
With the advent of high-performance computing devices, deep neural networks have gained a lot of popularity in solving many Natural Language Processing tasks. However, they are also vulnerable to adversarial attacks, which are able to modify the input text in order to mislead the target model. Adversarial attacks are a serious threat to the security of deep neural networks, and they can be used to craft adversarial examples that steer the model towards a wrong decision. In this dissertation, we propose SynBA, a novel contextualized synonym-based adversarial attack for text classification. SynBA is based on the idea of replacing words in the input text with their synonyms, which are selected according to the context of the sentence. We show that SynBA successfully generates adversarial examples that are able to fool the target model with a high success rate. We demonstrate three advantages of this proposed approach: (1) effective - it outperforms state-of-the-art attacks by semantic similarity and perturbation rate, (2) utility-preserving - it preserves semantic content, grammaticality, and correct types classified by humans, and (3) efficient - it performs attacks faster than other methods.
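The general idea of a synonym-substitution attack can be illustrated with a toy sketch. This is not SynBA: the contextual synonym selection described above is not reproduced, and the classifier, the synonym table and every name below are hypothetical stand-ins chosen only to show the search loop.

```python
# Hypothetical toy data: a keyword-based "model" and a fixed synonym table.
POSITIVE_WORDS = {"good", "great"}
SYNONYMS = {"good": ["fine", "decent"], "great": ["grand"]}


def toy_label(text):
    """Toy target model: 'pos' if any known positive keyword is present."""
    return "pos" if any(w in POSITIVE_WORDS for w in text.split()) else "neg"


def synonym_attack(text, classify, synonyms):
    """Try single-word synonym swaps; return the first text that flips the label.

    Returns None when no single substitution changes the prediction.
    """
    words = text.split()
    original = classify(text)
    for i, word in enumerate(words):
        for candidate in synonyms.get(word, []):
            trial = " ".join(words[:i] + [candidate] + words[i + 1:])
            if classify(trial) != original:
                return trial
    return None
```

A real attack would rank candidate substitutions by how well they fit the sentence context and by how much they change the target model's confidence, rather than accepting the first label flip.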
Abstract:
Visual impairment in children is a public health issue, responsible for learning difficulties and high school dropout rates. During the situational diagnosis of the catchment area, and as a result of the implementation of the Programa Saúde na Escola (Health at School Program) in Coronel Fabriciano, it became evident that one of the main problems followed by the Unidade Básica de Saúde (UBS, Basic Health Unit) was the high number of visual alterations in children and adolescents, mainly among those who did not wear corrective lenses. Therefore, the present study aims to develop an action plan to assess visual acuity and identify children with visual alterations at the Escola Maria da Penha Lima, in the municipality of Coronel Fabriciano, and refer them to the ophthalmologist of the Projeto Olhar Brasil. A literature review was carried out using the Biblioteca Virtual de Saúde (BVS) and Scientific Electronic Library Online (SciELO) databases and official documents of the Ministério da Saúde (Ministry of Health). The main expected results include the identification and referral of children with visual alterations for appropriate treatment, and a stronger bond between the Estratégia de Saúde da Família (Family Health Strategy) and the school, the children and their families.
Abstract:
Ochnaceae s.str. (Malpighiales) are a pantropical family of about 500 species and 27 genera of almost exclusively woody plants. Infrafamilial classification and relationships have been controversial partially due to the lack of a robust phylogenetic framework. Including all genera except Indosinia and Perissocarpa and DNA sequence data for five DNA regions (ITS, matK, ndhF, rbcL, trnL-F), we provide for the first time a nearly complete molecular phylogenetic analysis of Ochnaceae s.l. resolving most of the phylogenetic backbone of the family. Based on this, we present a new classification of Ochnaceae s.l., with Medusagynoideae and Quiinoideae included as subfamilies and the former subfamilies Ochnoideae and Sauvagesioideae recognized at the rank of tribe. Our data support a monophyletic Ochneae, but Sauvagesieae in the traditional circumscription is paraphyletic because Testulea emerges as sister to the rest of Ochnoideae, and the next clade shows Luxemburgia+Philacra as sister group to the remaining Ochnoideae. To avoid paraphyly, we classify Luxemburgieae and Testuleeae as new tribes. The African genus Lophira, which has switched between subfamilies (here tribes) in past classifications, emerges as sister to all other Ochneae. Thus, endosperm-free seeds and ovules with partly to completely united integuments (resulting in an apparently single integument) are characters that unite all members of that tribe. The relationships within its largest clade, Ochnineae (former Ochneae), are poorly resolved, but former Ochninae (Brackenridgea, Ochna) are polyphyletic. Within Sauvagesieae, the genus Sauvagesia in its broad circumscription is polyphyletic as Sauvagesia serrata is sister to a clade of Adenarake, Sauvagesia spp., and three other genera. Within Quiinoideae, in contrast to former phylogenetic hypotheses, Lacunaria and Touroulia form a clade that is sister to Quiina. 
Bayesian ancestral state reconstructions showed that zygomorphic flowers with adaptations to buzz-pollination (poricidal anthers), a syncarpous gynoecium (a near-apocarpous gynoecium evolved independently in Quiinoideae and Ochninae), numerous ovules, septicidal capsules, and winged seeds with endosperm are the ancestral condition in Ochnoideae. Although in some lineages poricidal anthers were lost secondarily, the evolution of poricidal superstructures secured the maintenance of buzz-pollination in some of these genera, indicating a strong selective pressure on keeping that specialized pollination system.
Abstract:
Pancreatic β-cells are highly sensitive to suboptimal or excess nutrients, as occurs in protein-malnutrition and obesity. Taurine (Tau) improves insulin secretion in response to nutrients and depolarizing agents. Here, we assessed the expression and function of Cav and KATP channels in islets from malnourished mice fed on a high-fat diet (HFD) and supplemented with Tau. Weaned mice received a normal (C) or a low-protein diet (R) for 6 weeks. Half of each group were fed a HFD for 8 weeks without (CH, RH) or with 5% Tau since weaning (CHT, RHT). Isolated islets from R mice showed lower insulin release with glucose and depolarizing stimuli. In CH islets, insulin secretion was increased and this was associated with enhanced KATP inhibition and Cav activity. RH islets secreted less insulin at high K(+) concentration and showed enhanced KATP activity. Tau supplementation normalized K(+)-induced secretion and enhanced glucose-induced Ca(2+) influx in RHT islets. R islets presented lower Ca(2+) influx in response to tolbutamide, and higher protein content and activity of the Kir6.2 subunit of the KATP. Tau increased the protein content of the α1.2 subunit of the Cav channels and the SNARE proteins SNAP-25 and Synt-1 in CHT islets, whereas in RHT, Kir6.2 and Synt-1 proteins were increased. In conclusion, impaired islet function in R islets is related to higher content and activity of the KATP channels. Tau treatment enhanced RHT islet secretory capacity by improving the protein expression and inhibition of the KATP channels and enhancing Synt-1 islet content.