903 results for Audio-visual content classification
Abstract:
Scene classification based on latent Dirichlet allocation (LDA) is a more general modeling method known as a bag of visual words, in which the construction of a visual vocabulary is a crucial quantization process to ensure the success of the classification. A framework is developed using the following new aspects: Gaussian mixture clustering for the quantization process; the use of an integrated visual vocabulary (IVV), which is built as the union of all centroids obtained from the separate quantization process of each class; and the use of several features, including edge orientation histogram, CIELab color moments, and gray-level co-occurrence matrix (GLCM). The experiments are conducted on IKONOS images with six semantic classes (tree, grassland, residential, commercial/industrial, road, and water). The results show that the use of an IVV increases the overall accuracy (OA) by 11 to 12% and 6% when it is implemented on the selected and all features, respectively. The selected features of CIELab color moments and GLCM together provide a better OA than either CIELab color moments or GLCM alone. The latter increases the OA by only ∼2 to 3%. Moreover, the results show that the OA of LDA outperforms the OA of C4.5 and naive Bayes tree by ∼20%. © 2014 Society of Photo-Optical Instrumentation Engineers (SPIE) [DOI: 10.1117/1.JRS.8.083690]
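The integrated visual vocabulary described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: plain k-means stands in for the Gaussian mixture clustering, the 2-D descriptors and class data are synthetic, and the function names (`kmeans`, `integrated_vocabulary`, `word_histogram`) are hypothetical. The sketch clusters each class separately, takes the union of the resulting centroids as the vocabulary, and quantizes a set of descriptors into a word-count histogram of the kind a topic model such as LDA would consume.

```python
import random
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; an assumed stand-in for Gaussian mixture clustering."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Move each center to the mean of its members; keep empty clusters as-is.
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers

def integrated_vocabulary(features_by_class, words_per_class):
    """Union of the centroids obtained by quantizing each class separately."""
    vocab = []
    for label in sorted(features_by_class):
        vocab.extend(kmeans(features_by_class[label], words_per_class))
    return vocab

def word_histogram(descriptors, vocab):
    """Count how many descriptors fall on each visual word."""
    counts = Counter(nearest(d, vocab) for d in descriptors)
    return [counts[i] for i in range(len(vocab))]

# Synthetic 2-D descriptors for two classes (illustrative data only).
data_rng = random.Random(1)
features_by_class = {
    "water": [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(50)],
    "tree": [(data_rng.gauss(5, 1), data_rng.gauss(5, 1)) for _ in range(50)],
}
vocab = integrated_vocabulary(features_by_class, words_per_class=3)
hist = word_histogram(features_by_class["water"], vocab)
print(len(vocab), sum(hist))
```

Building the vocabulary per class keeps centroids for rare classes from being crowded out by frequent ones, which is one plausible reading of why the abstract reports an accuracy gain from the IVV.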
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
The wide use of e-technologies represents a great opportunity for underserved segments of the population, especially with the aim of reintegrating excluded individuals into society through education. This is particularly true for people with different types of disabilities who may have difficulties while attending traditional on-site learning programs that are typically based on printed learning resources. The creation and provision of accessible e-learning contents may therefore become a key factor in enabling people with different access needs to enjoy quality learning experiences and services. Another e-learning challenge is represented by m-learning (which stands for mobile learning), which is emerging as a consequence of the diffusion of mobile terminals and provides the opportunity to browse didactical materials everywhere, outside places that are traditionally devoted to education. Both such situations share the need to access materials in limited conditions and collide with the growing use of rich media in didactical contents, which are designed to be enjoyed without any restriction. Nowadays, Web-based teaching makes great use of multimedia technologies, ranging from Flash animations to prerecorded video-lectures. Rich media in e-learning can offer significant potential in enhancing the learning environment, through helping to increase access to education, enhance the learning experience and support multiple learning styles. Moreover, they can often be used to improve the structure of Web-based courses. These highly variegated and structured contents may significantly improve the quality and the effectiveness of educational activities for learners. For example, rich media contents allow us to describe complex concepts and process flows. Audio and video elements may be utilized to add a “human touch” to distance-learning courses. Finally, real lectures may be recorded and distributed to integrate or enrich online materials.
A confirmation of the advantages of these approaches can be seen in the exponential growth of video-lecture availability on the net, due to the ease of recording and delivering activities which take place in a traditional classroom. Furthermore, the wide use of assistive technologies for learners with disabilities injects new life into e-learning systems. E-learning allows distance and flexible educational activities, thus helping disabled learners to access resources which would otherwise present significant barriers for them. For instance, students with visual impairments have difficulties in reading traditional visual materials, deaf learners have trouble in following traditional (spoken) lectures, and people with motion disabilities have problems in attending on-site programs. As already mentioned, the use of wireless technologies and pervasive computing may really enhance the educational learner experience by offering mobile e-learning services that can be accessed by handheld devices. This new paradigm of educational content distribution maximizes the benefits for learners since it enables users to overcome constraints imposed by the surrounding environment. While certainly helpful for users without disabilities, we believe that the use of new mobile technologies may also become a fundamental tool for impaired learners, since it frees them from sitting in front of a PC. In this way, educational activities can be enjoyed by all the users, without hindrance, thus increasing the social inclusion of non-typical learners. While the provision of fully accessible and portable video-lectures may be extremely useful for students, it is widely recognized that structuring and managing rich media contents for mobile learning services are complex and expensive tasks. Indeed, major difficulties originate from the basic need to provide a textual equivalent for each media resource composing a rich media Learning Object (LO).
Moreover, tests need to be carried out to establish whether a given LO is fully accessible to all kinds of learners. Unfortunately, both these tasks are truly time-consuming processes, depending on the type of contents the teacher is writing and on the authoring tool he/she is using. Due to these difficulties, online LOs are often distributed as partially accessible or totally inaccessible content. Bearing this in mind, this thesis aims to discuss the key issues of a system we have developed to deliver accessible, customized or nomadic learning experiences to learners with different access needs and skills. To reduce the risk of excluding users with particular access capabilities, our system exploits Learning Objects (LOs) which are dynamically adapted and transcoded based on the specific needs of non-typical users and on the barriers that they can encounter in the environment. The basic idea is to dynamically adapt contents, by selecting them from a set of media resources packaged in SCORM-compliant LOs and stored in a self-adapting format. The system schedules and orchestrates a set of transcoding processes based on specific learner needs, so as to produce a customized LO that can be fully enjoyed by any (impaired or mobile) student.
Core networks for visual-concrete and abstract thought content: a brain electric microstate analysis
Abstract:
Commonality of activation of spontaneously forming and stimulus-induced mental representations is an often made but rarely tested assumption in neuroscience. In a conjunction analysis of two earlier studies, brain electric activity during visual-concrete and abstract thoughts was studied. The conditions were: in study 1, spontaneous stimulus-independent thinking (post-hoc, visual imagery or abstract thought were identified); in study 2, reading of single nouns ranking high or low on a visual imagery scale. In both studies, subjects' tasks were similar: when prompted, they had to recall the last thought (study 1) or the last word (study 2). In both studies, subjects had no instruction to classify or to visually imagine their thoughts, and accordingly were not aware of the studies' aim. Brain electric data were analyzed into functional topographic brain images (using LORETA) of the last microstate before the prompt (study 1) and of the word-type discriminating event-related microstate after word onset (study 2). Conjunction analysis across the two studies yielded commonality of activation of core networks for abstract thought content in left anterior superior regions, and for visual-concrete thought content in right temporal-posterior inferior regions. The results suggest that two different core networks are automatedly activated when abstract or visual-concrete information, respectively, enters working memory, without a subject task or instruction about the two classes of information, and regardless of internal or external origin, and of input modality. These core machineries of working memory thus are invariant to source or modality of input when treating the two types of information.
Abstract:
OBJECTIVE: To compare the content covered by twelve obesity-specific health status measures using the International Classification of Functioning, Disability and Health (ICF). DESIGN: Obesity-specific health status measures were identified and then linked to the ICF separately by two trained health professionals according to standardized guidelines. The degree of agreement between health professionals was calculated by means of the kappa (κ) statistic. Bootstrapped confidence intervals (CI) were calculated. The obesity-specific health-status measures were compared on the component and category level of the ICF. MEASUREMENTS: Twelve condition-specific health-status measures were identified and included in this study, namely the obesity-related problem scale, the obesity eating problems scale, the obesity-related coping and obesity-related distress questionnaire, the impact of weight on quality of life questionnaire (short version), the health-related quality of life questionnaire, the obesity adjustment survey (short form), the short specific quality of life scale, the obesity-related well-being questionnaire, the bariatric analysis and reporting outcome system, the bariatric quality of life index, the obesity and weight loss quality of life questionnaire and the weight-related symptom measure. RESULTS: In the 280 items of the eight measures, a total of 413 concepts were identified and linked to 87 different ICF categories. The measures varied strongly in the number of concepts contained and the number of ICF categories used to map these concepts. Items on body functions varied from 12% in the obesity-related problem scale to 95% in the weight-related symptom measure. The estimated kappa coefficients ranged between 0.79 (CI: 0.72, 0.86) at the ICF component level and 0.97 (CI: 0.93, 1.0) at the third ICF level. CONCLUSION: The ICF proved highly useful for the content comparison of obesity-specific health-status measures.
The results may provide clinicians and researchers with new insights when selecting health-status measures for clinical studies in obesity.
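The agreement analysis in this abstract can be sketched as follows. The abstract names only "the kappa statistic" and "bootstrapped confidence intervals", so this sketch assumes Cohen's kappa for two raters and a percentile bootstrap over items. The function names and the ICF-style labels are illustrative, not data from the study.

```python
import random
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same category throughout
    return (observed - expected) / (1 - expected)

def bootstrap_ci(rater_a, rater_b, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for kappa, resampling items with replacement."""
    rng = random.Random(seed)
    n = len(rater_a)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(cohen_kappa([rater_a[i] for i in idx],
                                 [rater_b[i] for i in idx]))
    stats.sort()
    lower = stats[int((1 - level) / 2 * n_boot)]
    upper = stats[min(n_boot - 1, int((1 + level) / 2 * n_boot))]
    return lower, upper

# Illustrative linkage of ten concepts to ICF-style codes by two raters.
rater_a = ["b130", "b130", "d450", "d450", "d570", "b130", "d450", "d570", "b130", "d450"]
rater_b = ["b130", "b130", "d450", "d570", "d570", "b130", "d450", "d570", "d450", "d450"]
kappa = cohen_kappa(rater_a, rater_b)
ci_low, ci_high = bootstrap_ci(rater_a, rater_b)
print(round(kappa, 2), round(ci_low, 2), round(ci_high, 2))
```

Kappa corrects raw agreement for the agreement two raters would reach by chance given how often each uses every category, which is why it is preferred over a simple percentage when categories are unevenly used.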
Abstract:
In this paper, we present a novel coarse-to-fine visual localization approach: contextual visual localization. This approach relies on three elements: (i) a minimal-complexity classifier for performing fast coarse localization (submap classification); (ii) an optimized saliency detector which exploits the visual statistics of the submap; and (iii) a fast view-matching algorithm which filters initial matchings with a structural criterion. The latter algorithm yields fine localization. Our experiments show that these elements have been successfully integrated for solving the global localization problem. Context, that is, the awareness of being in a particular submap, is defined by a supervised classifier tuned for a minimal set of features. Visual context is exploited both for tuning (optimizing) the saliency detection process, and to select potential matching views in the visual database, close enough to the query view.
Abstract:
Visual mental imagery is a complex process that may be influenced by the content of mental images. Neuropsychological evidence from patients with hemineglect suggests that in the imagery domain environments and objects may be represented separately and may be selectively affected by brain lesions. In the present study, we used functional magnetic resonance imaging (fMRI) to assess the possibility of neural segregation among mental images depicting parts of an object, of an environment (imagined from a first-person perspective), and of a geographical map, using both a mass univariate and a multivariate approach. Data show that different brain areas are involved in different types of mental images. Imagining an environment relies mainly on regions known to be involved in navigational skills, such as the retrosplenial complex and parahippocampal gyrus, whereas imagining a geographical map mainly requires activation of the left angular gyrus, known to be involved in the representation of categorical relations. Imagining a familiar object mainly requires activation of parietal areas involved in visual space analysis in both the imagery and the perceptual domain. We also found that the pattern of activity in most of these areas specifically codes for the spatial arrangement of the parts of the mental image. Our results clearly demonstrate a functional neural segregation for different contents of mental images and suggest that visuospatial information is coded by different patterns of activity in brain areas involved in visual mental imagery. Hum Brain Mapp 36:945-958, 2015.
Classification of Paintings by Artist, Movement, and Indoor Setting Using MPEG-7 Descriptor Features
Abstract:
ACM Computing Classification System (1998): I.4.9, I.4.10.
Abstract:
The abundance of visual data and the push for robust AI are driving the need for automated visual sensemaking. Computer Vision (CV) faces growing demand for models that can discern not only what images "represent," but also what they "evoke." This is a demand for tools mimicking human perception at a high semantic level, categorizing images based on concepts like freedom, danger, or safety. However, automating this process is challenging due to entropy, scarcity, subjectivity, and ethical considerations. These challenges not only impact performance but also underscore the critical need for interoperability. This dissertation focuses on abstract concept-based (AC) image classification, guided by three technical principles: situated grounding, performance enhancement, and interpretability. We introduce ART-stract, a novel dataset of cultural images annotated with ACs, serving as the foundation for a series of experiments across four key domains: assessing the effectiveness of the end-to-end DL paradigm, exploring cognitive-inspired semantic intermediaries, incorporating cultural and commonsense aspects, and neuro-symbolic integration of sensory-perceptual data with cognitive-based knowledge. Our results demonstrate that integrating CV approaches with semantic technologies yields methods that surpass the current state of the art in AC image classification, outperforming the end-to-end deep vision paradigm. The results emphasize the role semantic technologies can play in developing both effective and interpretable systems, through the capturing, situating, and reasoning over knowledge related to visual data. Furthermore, this dissertation explores the complex interplay between technical and socio-technical factors. By merging technical expertise with an understanding of human and societal aspects, we advocate for responsible labeling and training practices in visual media. 
These insights and techniques not only advance efforts in CV and explainable artificial intelligence but also propel us toward an era of AI development that harmonizes technical prowess with deep awareness of its human and societal implications.
Abstract:
To assess the construct validity and reliability of the Pediatric Patient Classification Instrument. A correlation study developed at a teaching hospital. The classification involved 227 patients, using the Pediatric Patient Classification Instrument. The construct validity was assessed through the factor analysis approach and reliability through internal consistency. The Exploratory Factor Analysis identified three constructs explaining 67.5% of the variance and, in the reliability assessment, the following Cronbach's alpha coefficients were found: 0.92 for the instrument as a whole; 0.88 for the Patient domain; 0.81 for the Family domain; 0.44 for the Therapeutic procedures domain. The instrument evidenced its construct validity and reliability, and these analyses indicate the feasibility of the instrument. The validation of the Pediatric Patient Classification Instrument still represents a challenge, due to its relevance for a closer look at pediatric nursing care and management. Further research should be considered to explore its dimensionality and content validity.
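The internal-consistency figures reported in this abstract are Cronbach's alpha coefficients. A minimal sketch of the standard formula follows, assuming one row of item scores per patient and sample variances. The function name and the toy scores are hypothetical and unrelated to the study's data.

```python
from statistics import variance

def cronbach_alpha(scores):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).

    scores: one row per respondent, each row holding that respondent's k item scores.
    """
    k = len(scores[0])
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Toy data: five respondents, three items scored 1 to 4 (illustrative only).
toy_scores = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 3],
    [4, 3, 4],
    [4, 4, 4],
]
print(round(cronbach_alpha(toy_scores), 2))
```

Alpha rises when items vary together, so a low value for a single domain, such as the 0.44 reported for Therapeutic procedures, suggests its items do not measure one shared construct consistently.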
Abstract:
To improve the content validity of the instrument for classification of pediatric patients and evaluate its construct validity. A descriptive exploratory study for the measurement of the content validity index, with a correlational design for construct validation through exploratory factor analysis. The content validity index for indicators was 0.99 and it was 0.97 for graded situations. Three domains were extracted in the construct validation, namely: patient, family and therapeutic procedures, with 74.97% of explained variance. The instrument showed evidence of content and construct validity. The validation of the instrument occurred under the approach of family-centered care, and allowed incorporating some essential needs of childhood such as playing, interaction and affection in the content of the instrument.
Abstract:
A case of neuronal ceroid-lipofuscinosis (NCL) is reported in an 11-year-old girl, whose main symptoms were progressive dementia since the age of 4 years and choreic movements since age 10. Seizures, myoclonus and visual deterioration were absent and optic fundi were normal. A cerebral biopsy disclosed two basic types of stored substance in the cytoplasm of neurons: a) severely ballooned nerve cells in cortical layers III and V contained a non-autofluorescent material, which stained with PAS and Sudan Black B in frozen, but not in paraffin sections; ultrastructurally, these neurons showed abundant corpuscles similar to the membranous cytoplasmic bodies of Tay-Sachs disease and, in smaller amounts, also zebra bodies; b) slightly distended or non-distended neurons in all layers contained lipopigment granules, which were autofluorescent, PAS-positive and sudanophilic in both frozen and paraffin sections; their ultrastructure was closely comparable to that of lipofuscin. Similar bodies were found in the swollen segments of axons and in a few astrocytes and endothelial cells. The histochemical and ultrastructural demonstration of large amounts of lipopigments allows a presumptive classification of the case as NCL. However, the presence of involuntary movements, the absence of visual disturbances and the unusual ultrastructural features place the patient into a small heterogeneous group within the NCL. A better classification of such unique instances of the disease must await elucidation of the basic enzymatic defects.
Abstract:
Universidade Estadual de Campinas. Faculdade de Educação Física