904 results for audio-visual automatic speech recognition
Abstract:
The perception systems of today's autonomous robots are quite complex. Information from the various sensors, mounted on different parts of the robot, must be related to one another with respect to the robot frame or the world frame. For this, knowledge of the pose (position and rotation) between the sensor frames and the robot frame is a critical factor in the robot's performance. The process of calibrating these positions and rotations is called extrinsic parameter calibration. This dissertation proposes the development of an autonomous calibration method for robots with directional cameras, such as the robots of the ISePorto team. The proposed solution consists of acquiring vision, gyroscope and odometry data during a maneuver performed by the robot around a target with a known pattern. This information is then processed jointly by an Extended Kalman Filter (EKF), which estimates the parameters needed to relate the robot's sensors to the robot frame. The solution was evaluated in several tests, and the results were very similar to those of the previously used manual method, with a significant gain in speed and consistency.
Abstract:
SUMMARY: The first part offers some brief notes on the scientific, historical and cultural panorama of vision, considered from three perspectives: the eye (the human eye in its specific phylogenetic position, the basic anatomical and functional element of the visual system to which the brain belongs), the eyes (essential twin units of the face in their consensual and conjugate binocular activity), and the gaze (charged with psychological expression and with its effect on the observer, a signal for behaviour and a source of feelings, preserved in works of art and in popular superstitions). This is followed by a cross-sectional descriptive study intended to contribute to knowledge of the visual health of children in the Lisbon region and to determine the factors that influence it. Between October 2005 and August 2006, 649 children under 10 years of age were examined at the Pediatric Ophthalmology Clinic of the Serviços de Assistência Médico-Social do Sindicato dos Bancários do Sul e Ilhas (SAMS). Data were collected on more than 250 primary variables covering most items of the usual ophthalmological examination. The analysis paid particular attention to age, which plays a decisive role in the main stages of visual system development. For children aged 6 to 7, results from SAMS and from the schools are set side by side. The abundance of numerical data made it necessary to test frequently the statistical significance of subgroup results. Some results of the study, mostly from the SAMS group: among children aged 6 to 7, 71.1% (SAMS) and 91.5% (schools) had not been examined before the age of 4. Overall frequency of myopic disorders 9.4% and of hypermetropic disorders 25.3%, both varying markedly with age. Convergent strabismus 3.9%.
Amblyopia 2.6% (13/491 children aged 4 years or older), more frequent in girls, in children first examined after the age of 4, and in those whose parents did not adhere to the prescribed therapy. Specific objectives addressed visual acuity and ocular refraction. The comparative study of automatic refractometry without and with cycloplegia showed that visual acuity testing alone is insufficient for a correct diagnosis. The analysis of family ophthalmological history demonstrated the importance of knowing it and revealed, among others, the following relationships: Children with a family history of myopic disorders have a higher frequency of diagnosis of myopic disorders and of negative refraction, and a higher rate of quantitative diagnosis/refraction agreement for myopic disorders. These children also show, in general, the inverse characteristics with respect to hypermetropic disorders. Children with a family history of hypermetropic disorders have a higher frequency of diagnosis of hypermetropic disorders. Children with a family history of strabismus have a higher frequency of diagnosis of manifest convergent strabismus and of esodeviations as a whole. Children with a family history of astigmatism have a higher frequency of diagnosis of astigmatism. Some childhood ophthalmological profiles are outlined, giving a synoptic view of a set of visual health parameters. Data on parents' adherence to the prescribed therapy and on attitudes to wearing glasses, as well as data on the child's classroom behaviour and learning difficulties, were generally too scarce to support conclusions, although they show indications that merit future investigation.
In parallel, orthoptists and nurses carried out a school screening for visual acuity <0.8 and for disorders of extrinsic ocular motility, covering 520 pupils in the 1st year of the 1st cycle of basic education (2005/2006) in public schools of the city of Lisbon. 101 of these children were examined in the author's office, some referred from the screening and others as controls. For visual acuity, the negative predictive value was 91% but the positive predictive value only 67% (33% false positives, and hence a high over-referral rate). The quality of the screening performed by orthoptists was lower than that of the screening performed by nurses. The screening was not of acceptable quality. A survey of physicians and nurses at health centres was conducted on knowledge, attitudes and practices relating to pediatric ophthalmological care. The results are discussed, conclusions are drawn, and recommendations are made that may contribute to better visual health in children. ABSTRACT: Firstly, some brief remarks are made on the scientific, historical and cultural panorama of human vision with regard to three approaches: the eye (the human eye in its specific phylogenetic place, the fundamental anatomical and functional element of the visual system in interaction with the brain), the eyes (essential twin units of the face with their consensual and conjugate binocular activity), and the gaze (psychologically loaded, a means to express oneself and to influence the observer, a guide to other people's behaviour, consolidated in works of art and in people's traditional superstitious beliefs and ways of thinking). A report is made on a cross-sectional descriptive study whose goal is to contribute to knowledge of the level of visual health of children in the Lisbon region and to identify the factors that determine it.
Between October 2005 and August 2006, 649 children under 10 years of age were examined at the pediatric ophthalmology clinic of the SAMS (Serviços de Assistência Médico-Social do Sindicato dos Bancários do Sul e Ilhas). Data were collected on more than 250 primary variables covering most items of the usual ophthalmological examination. Special attention was paid to the children's age, since it plays a crucial role in the main stages of visual system development. For children aged 6 to 7, SAMS and school results are often put side by side. Because of the large amount of numerical data, it was often necessary to check the statistical significance of differences between subgroups. Some of the study's results (mostly SAMS): among children aged 6 to 7, 71.1% (SAMS) and 91.5% (schools) had not had an ophthalmological examination before the age of 4. Total frequency of myopic disorders 9.4% and of hypermetropic disorders 25.3%, both showing great differences between age groups; convergent strabismus 3.9%; amblyopia 2.6% (13/491 children over 3 years old), more frequent among girls, in those first examined after the age of 4, and in those whose parents did not comply with the therapy prescribed for the child. Specific objectives dealt with visual acuity and ocular refraction. The comparison of automatic refractometry without and with cycloplegia showed that visual acuity testing is often not enough for a correct diagnosis. Eye disorders in the family history proved to be very important information. Analysis of the corresponding data disclosed many relationships, among others: Children with a family history of myopic disorders more frequently have a diagnosis of myopic disorders and a negative refraction, and a higher rate of quantitative diagnosis/refraction matching for myopic disorders. Those children have, in general, the inverse characteristics regarding hypermetropic disorders.
Children with a family history of hypermetropic disorders more frequently have a diagnosis of hypermetropic disorders. Children with a family history of strabismus more frequently have a diagnosis of manifest convergent strabismus and of all forms of esodeviation. Children with a family history of astigmatism more frequently have a diagnosis of astigmatism. Ophthalmological profiles are drawn that give a synoptic view of a set of visual health parameters. Data on parents' compliance with the therapy prescribed for the child and on attitudes to the child wearing glasses, as well as data on the child's behaviour in the classroom and learning difficulties, were as a rule too few to allow conclusions, but point to questions that need further study. During the same study period, orthoptists and nurses screened children in the 1st grade of public schools (term 2005/2006) in the city of Lisbon for visual acuity <0.8 and for ocular motility disorders. 520 such children were screened. 101 of them were examined by the author in her medical office; some had been referred, the others were taken as controls. Regarding visual acuity, the predictive value of a negative test was 91%, but the predictive value of a positive test was only 67% (33% false positive results, and consequently too high a rate of over-referral). Screening performed by orthoptists was of lower quality than screening performed by nurses. On the whole, this screening did not have the required quality. A survey of physicians' and nurses' knowledge, attitudes and practices related to pediatric ophthalmological care was carried out in health centers. Results are discussed and conclusions drawn. Some suggestions are made aiming at better visual health for children.
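The predictive values quoted for the school screening are simple ratios of confusion-matrix counts. The sketch below shows the arithmetic in Python. The abstract does not report the underlying counts, so the numbers used here are hypothetical and were chosen only so that the ratios land near the reported 67% and 91%.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from confusion-matrix counts.

    PPV = TP / (TP + FP): share of positive screenings that are true positives.
    NPV = TN / (TN + FN): share of negative screenings that are true negatives.
    """
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Hypothetical counts for illustration only, NOT taken from the study.
ppv, npv = predictive_values(tp=20, fp=10, tn=61, fn=6)
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
print(f"Share of positive referrals that are false: {1 - ppv:.0%}")
```

The share of positive referrals that turn out to be false is 1 - PPV, which is the figure the abstract ties to over-referral.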
Abstract:
In recent years the number of video surveillance systems deployed in all kinds of environments has grown, and these systems have become increasingly sophisticated. Casinos are a popular example: many casinos now use cameras for automatic control of their gaming operations. However, automatic control is not yet available for several types of game, one of them being Banca Francesa. This dissertation proposes a set of algorithms designed for a control and management system for the casino game Banca Francesa, supported by computer vision components and taking into account the most relevant existing contributions in the area by researchers and related organizations. Four distinct modules are presented, intended to help casinos prevent fraud during their operations and to support the automatic collection of game results. The four modules are: Dice Sample Generator, a module for creating test cases on a large scale; Dice Sample Analyzer, a module for detecting game results; Dice Calibration, a module for automatic calibration of the system; and Motion Detection, a module for detecting fraud in the game. Finally, for each module, a set of tests and analyses is presented to verify whether the concept can be proven for each proposal.
Abstract:
As the wireless cellular market reaches unprecedented levels of competition, network operators need to keep Quality of Service (QoS) a main priority if they wish to attract new subscribers while keeping existing customers satisfied. Speech quality as perceived by the end user is one major example of a characteristic in constant need of maintenance and improvement, and it is the topic of this Master's thesis project. The project uses an intrusive method of speech quality evaluation to further study and characterize the performance of speech codecs in second-generation (2G) and third-generation (3G) technologies. It looks for further correlation between codecs with similar bit rates and explores certain transmission parameters that may aid in the assessment of speech quality. Due to some limitations of the audio analyzer equipment that was to be employed, a different system for recording the test samples was sought. Although the newly designed system is not standard, after extensive testing and optimization of its parameters the final results were found to be reliable and satisfactory. The tests cover a set of high and low bit rate codecs for both 2G and 3G; comparing and analysing the values led to the conclusion that 3G speech codecs perform better than 2G codecs under approximately the same conditions. This reinforces the idea that 3G is the better choice if the customer is looking for the best possible listening speech quality. The transmission parameters chosen for the experiment, Receiver Quality (RxQual) and Received Energy per Chip to the Power Density Ratio (Ec/N0), were subjected to speech quality correlation tests. The final RxQual results were compared with those of prior studies by other researchers and are considered highly relevant, confirming RxQual as a reliable indicator of speech quality.
As for Ec/N0, it cannot be stated to be a speech quality indicator; however, it shows clear thresholds at which the MOS values decrease significantly. The studied transmission parameters can therefore be used not only for network management purposes but also to give the communications engineer (or technician) an idea of the expected end-to-end speech quality consequences. The conclusion of this work suggests new ideas for future studies. Since fourth-generation (4G) cellular technologies, the first with an all-IP network structure, are now beginning to take an important place in the global market, it seems highly relevant that 4G speech quality be evaluated and compared with 3G, not only in narrowband but also in wideband scenarios, using the most recent standard objective method of speech quality assessment, POLQA. In addition, the new data found in the Ec/N0 tests justify further research to validate the assumptions made in this work.
Abstract:
The robotics community is concerned with the ability to infer and compare the results of researchers in areas such as vision perception and multi-robot cooperative behavior. To that end, this paper proposes a real-time indoor visual ground truth system capable of providing accuracy at least an order of magnitude better than the precision of the algorithm to be evaluated. A multi-camera architecture is proposed under the ROS (Robot Operating System) framework to estimate the 3D position of objects, and the implementation and results are contextualized in the RoboCup Middle Size League scenario.
Abstract:
This work presents an automatic calibration method for a vision-based external underwater ground-truth positioning system. Such systems are a relevant tool for benchmarking and assessing the quality of research in underwater robotics applications. In suitable environments, such as test tanks or clear water conditions, a stereo vision system can provide accurate positions at low cost and with flexible operation. In this work we present a two-step extrinsic camera parameter calibration procedure that reduces the setup time and provides accurate results. The proposed method uses a planar homography decomposition to determine the relative camera poses, and the vanishing points of lines detected in the image to obtain the global pose of the stereo rig in the reference frame. The method was applied to our external vision-based ground-truth system at the INESC TEC/Robotics test tank. Results are presented in comparison with a precise calibration performed using points obtained from an accurate 3D LIDAR model of the environment.
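The vanishing-point step mentioned in the abstract can be illustrated with plain projective geometry. The sketch below is not the authors' implementation; it only shows, under the assumption that two image lines are each given by two pixel points, how their common vanishing point falls out of two cross products in homogeneous coordinates. All function names and coordinates are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors (homogeneous points or lines)."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p, q):
    """Homogeneous line through two image points given as (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(line_a, line_b, eps=1e-12):
    """Intersection of two homogeneous lines, returned as (x, y).

    Returns None when the lines are parallel in the image, i.e. the
    intersection lies at infinity.
    """
    x, y, w = cross(line_a, line_b)
    if abs(w) < eps:
        return None
    return (x / w, y / w)


# Two image lines (hypothetical pixel coordinates) that are projections of
# parallel scene lines and therefore converge in the image.
l1 = line_through((0.0, 0.0), (50.0, 25.0))
l2 = line_through((0.0, 100.0), (50.0, 75.0))
vp = vanishing_point(l1, l2)
print(vp)
```

With more than two detected lines, a least-squares intersection would normally replace the single cross product; that choice is left open here because the abstract does not specify it.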
Abstract:
In the last few years the number of systems and devices that use voice-based interaction has grown significantly. For continued use of these systems, the interface must be reliable and pleasant in order to provide an optimal user experience. However, there are currently very few studies that try to evaluate how good a voice is when the application is a speech-based interface. In this paper we present a new automatic voice pleasantness classification system based on prosodic and acoustic patterns of voice preference. Our study is based on a multi-language database composed of female voices. In the objective performance evaluation the system achieved a 7.3% error rate.
Abstract:
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Abstract:
Human Activity Recognition systems require objective and reliable methods that can be used in the daily routine and that offer results consistent with the activities performed. These systems are under development and offer objective and personalized support for several applications, for example in healthcare. This thesis aims to create a framework for human activity recognition based on accelerometry signals. Some new features and techniques inspired by audio recognition methodology are introduced in this work, namely the Log Scale Power Bandwidth and the application of Markov Models. Forward Feature Selection was adopted as the feature selection algorithm in order to improve clustering performance and limit the computational demands. This method selects the most suitable set of features for activity recognition in accelerometry from a 423-dimensional feature vector. Several Machine Learning algorithms were applied to the accelerometry databases used (the FCHA and PAMAP databases), and these showed promising results in activity recognition. The developed set of algorithms constitutes a significant contribution to the development of reliable evaluation methods of movement disorders for diagnosis and treatment applications.
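Forward Feature Selection, as named in the abstract, is usually a greedy loop: start with no features and repeatedly add the single feature that most improves a quality score, stopping when nothing helps. The sketch below shows that generic loop in Python. The scoring function, the feature indices and the stopping rule are assumptions made for illustration; the abstract does not say which criterion the thesis used.

```python
def forward_feature_selection(n_features, score, max_selected=None):
    """Greedy forward selection over feature indices 0..n_features-1.

    `score(subset)` is a caller-supplied function (hypothetical here) that
    returns a number, higher meaning better, for a list of feature indices.
    """
    selected = []
    best_score = float("-inf")
    remaining = set(range(n_features))
    while remaining and (max_selected is None or len(selected) < max_selected):
        candidate, candidate_score = None, best_score
        for f in sorted(remaining):
            s = score(selected + [f])
            if s > candidate_score:
                candidate, candidate_score = f, s
        if candidate is None:
            break  # no remaining feature improves the score
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return selected, best_score


# Toy score: features 1 and 3 are informative, every feature has a small cost.
USEFUL = {1: 0.5, 3: 0.3}


def toy_score(subset):
    return sum(USEFUL.get(f, 0.0) for f in subset) - 0.01 * len(subset)


chosen, final_score = forward_feature_selection(5, toy_score)
print(chosen, round(final_score, 2))
```

In practice `score` would wrap a cross-validated classifier or clustering quality measure evaluated on the candidate subset, which is where most of the computational cost lies.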
Abstract:
In cataract surgery, the eye’s natural lens is removed because it has become opaque and no longer allows clear vision. To maintain the eye’s optical power, a new artificial lens must be inserted. This lens, called an Intraocular Lens (IOL), needs to be modelled so that it has the correct refractive power to substitute the natural lens. Calculating the refractive power of this substitute lens requires precise anterior eye chamber measurements. An interferometry instrument, the AC Master from Zeiss Meditec, AG, was in use for half a year to perform these measurements. A Low Coherence Interferometry (LCI) measurement beam is aligned with the eye’s optical axis for precise measurements of anterior eye chamber distances. The eye follows a fixation target in order to align the visual axis with the optical axis. Performance problems occurred, however, at this step. There was therefore a need to develop a new procedure that ensures better alignment between the eye’s visual and optical axes, allowing a more user-friendly and versatile procedure and eventually automating the whole process. With this instrument, the alignment between the eye’s optical and visual axes is detected when Purkinje reflections I and III overlap as the eye follows a fixation target. In this project, image analysis is used to detect the positions of these Purkinje reflections, with the eventual aim of detecting automatically when they overlap. Automatic detection of the third Purkinje reflection of an eye following a fixation target is possible with some restrictions. Each pair of detected third Purkinje reflections is used to calculate automatically an acceptable starting position for the fixation target, as required for precise measurements of anterior eye chamber distances.
Abstract:
In this paper, we present an integrated system for real-time automatic detection of human actions from video. The proposed approach uses the boundary of humans as the main feature for recognizing actions. Background subtraction is performed using a Gaussian mixture model. Features are then extracted from the silhouettes, and Vector Quantization is used to map the features into symbols (bag-of-words approach). Finally, actions are detected using a Hidden Markov Model. The proposed system was validated using a newly collected real-world dataset. The results show that the system is capable of robust human detection in both indoor and outdoor environments. Moreover, promising classification results were achieved when detecting two basic human actions: walking and sitting.
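The last two stages of the pipeline described above, Vector Quantization followed by Hidden Markov Model scoring, can be sketched compactly. The code below is an illustration under stated assumptions, not the paper's system: the codebook, the two-state models and all probabilities are made up, silhouette features are reduced to 2D points, and classification simply picks the action whose model gives the highest forward-algorithm log-likelihood.

```python
import math


def quantize(feature, codebook):
    """Vector quantization: index of the nearest codeword (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(feature, codebook[k])),
    )


def log_likelihood(symbols, start, trans, emit):
    """Scaled forward algorithm: log P(symbols | HMM) for a discrete HMM."""
    n = len(start)
    alpha = [start[i] * emit[i][symbols[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(symbols)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbols[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def classify(symbols, models):
    """Pick the action whose HMM explains the symbol sequence best."""
    return max(models, key=lambda name: log_likelihood(symbols, *models[name]))


# Hypothetical codebook and models, for illustration only.
CODEBOOK = [(0.0, 0.0), (1.0, 1.0)]
TRANS = [[0.9, 0.1], [0.1, 0.9]]
MODELS = {
    "walking": ([0.5, 0.5], TRANS, [[0.8, 0.2], [0.7, 0.3]]),
    "sitting": ([0.5, 0.5], TRANS, [[0.2, 0.8], [0.3, 0.7]]),
}

features = [(0.1, 0.0), (0.2, 0.1), (0.9, 1.1), (0.0, 0.1)]
symbols = [quantize(f, CODEBOOK) for f in features]
action = classify(symbols, MODELS)
print(symbols, action)
```

A real system would learn the codebook (for example with k-means) and train one HMM per action from labelled sequences; those training steps are outside this sketch.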
Abstract:
The main features of most components consist of simple basic functional geometries: planes, cylinders, spheres and cones. Shape and position recognition of these geometries is essential for the dimensional characterization of components and represents an important contribution to the product life cycle, in particular to the manufacturing and inspection processes of the final product. This work aims to establish an algorithm that recognizes such geometries automatically, without operator intervention. Using differential geometry, large volumes of data can be processed and the basic functional geometries recognized. The original data can be obtained by rapid acquisition methods, such as 3D survey or photography, and then converted into Cartesian coordinates. Checking intrinsic decision conditions allows the different geometries to be identified quickly. Since inspection is generally a time-consuming task, this method reduces operator intervention in the process. The algorithm was first tested on geometric data generated in MATLAB and then on sets of data points acquired from real physical surfaces with a coordinate measuring machine and a 3D scan. A comparison of the time spent measuring is presented to show the advantage of the method. The results validated the suitability and potential of the proposed algorithm.
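One way differential geometry separates these four shapes is through principal curvatures: a plane has both equal to zero, a sphere has both equal to the same nonzero constant, a cylinder has one zero and one constant, and a cone has one zero and one that varies along the surface. The sketch below applies those textbook conditions in Python. It assumes the principal curvatures have already been estimated at sample points, and the decision rules and tolerance are illustrative assumptions, not the decision conditions used in the work itself.

```python
def classify_surface(curvatures, tol=1e-3):
    """Classify a surface patch from sampled principal curvatures.

    `curvatures` is a list of (k1, k2) pairs, one per sample point.
    The rules below are textbook properties, used here as assumed
    decision conditions.
    """
    small = [min(abs(k1), abs(k2)) for k1, k2 in curvatures]
    large = [max(abs(k1), abs(k2)) for k1, k2 in curvatures]
    large_is_constant = max(large) - min(large) < tol

    if max(large) < tol:
        return "plane"  # both curvatures vanish everywhere
    if max(small) < tol:
        # One curvature vanishes: constant other -> cylinder, varying -> cone.
        return "cylinder" if large_is_constant else "cone"
    umbilic = all(abs(k1 - k2) < tol for k1, k2 in curvatures)
    if umbilic and large_is_constant:
        return "sphere"  # equal, constant, nonzero curvatures
    return "unknown"


# Synthetic samples (hypothetical values).
plane = [(0.0, 0.0)] * 3
cylinder = [(0.5, 0.0)] * 3             # radius 2
sphere = [(0.25, 0.25)] * 3             # radius 4
cone = [(1.0, 0.0), (0.5, 0.0), (0.25, 0.0)]

for name, data in [("plane", plane), ("cylinder", cylinder),
                   ("sphere", sphere), ("cone", cone)]:
    print(name, "->", classify_surface(data))
```

With measured data the tolerance would need to reflect sensor noise, and curvature estimation from point clouds is itself a nontrivial step that this sketch takes as given.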
Abstract:
This research aims to advance blink detection in the context of work activity. Rather than patients having to attend a clinic, blinking videos can be acquired in a work environment and then analyzed automatically. This paper therefore presents a methodology for the automatic detection of eye blinks in consumer videos acquired with low-cost web cameras. The methodology detects the face and eyes of the recorded person and then analyzes the low-level features of the eye region to create a quantitative vector. Finally, this vector is classified into one of the two categories considered (open and closed eyes) by machine learning algorithms. The effectiveness of the proposed methodology was demonstrated, since it provides unbiased results with classification errors under 5%.
Abstract:
Master's dissertation in Quality Engineering and Management
Abstract:
Integrated master's dissertation in Information Systems Engineering and Management