11 results for sound processing
in Dalarna University College Electronic Archive
Abstract:
Background: Voice processing in real time is challenging. A drawback of previous work on Hypokinetic Dysarthria (HKD) recognition is the requirement of controlled settings in a laboratory environment. A personal digital assistant (PDA) has been developed for home assessment of Parkinson's disease (PD) patients. The PDA offers sound processing capabilities, which allow for developing a module for recognition and quantification of HKD. Objective: To compose an algorithm for assessment of PD speech severity in the home environment based on a review synthesis. Methods: A two-tier review methodology is utilized. The first tier focuses on real-time problems in speech detection. In the second tier, acoustic features that are robust to medication changes in Levodopa-responsive patients are investigated for HKD recognition. Keywords such as "Hypokinetic Dysarthria" and "Speech recognition in real time" were used in the search engines. IEEE Xplore produced the most useful search hits compared to Google Scholar, ELIN, EBRARY, PubMed and LIBRIS. Results: Vowel and consonant formants are the most relevant acoustic parameters for reflecting PD medication changes. Since the relevant speech segments (consonants and vowels) contain a minority of the speech energy, intelligibility can be improved by amplifying the voice signal using amplitude compression. Pause detection and peak-to-average power rate calculations for voice segmentation produce rich voice features in real time. Voice segmentation can be enhanced by introducing the zero-crossing rate (ZCR): consonants have a high ZCR, whereas vowels have a low ZCR. The wavelet transform is found promising for voice analysis since it quantizes non-stationary voice signals over time series using scale and translation parameters. In this way, voice intelligibility in the waveforms can be analyzed in each time frame. Conclusions: This review evaluated HKD recognition algorithms to develop a tool for PD speech home assessment using modern mobile technology. 
An algorithm that tackles real-time constraints in HKD recognition, based on the review synthesis, is proposed. We suggest that speech features may be further processed using wavelet transforms and used with a neural network for detection and quantification of speech anomalies related to PD. Based on this model, patients' speech can be automatically categorized according to UPDRS speech ratings.
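The zero-crossing rate mentioned in the results can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the thesis: the function names, the frame length, the hop size and the synthetic sine-wave test signals are all assumptions chosen for illustration. A low-frequency tone stands in for a vowel-like segment and a high-frequency tone for a consonant-like segment.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_zcr(signal, frame_len, hop):
    """ZCR of each full frame of `frame_len` samples, stepping by `hop`."""
    return [
        zero_crossing_rate(signal[i:i + frame_len])
        for i in range(0, len(signal) - frame_len + 1, hop)
    ]


def sine(freq_hz, sample_rate, n):
    """Hypothetical test signal: n samples of a pure tone."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


sample_rate = 8000  # assumed sampling rate in Hz
vowel_like = sine(150, sample_rate, 400)       # low frequency, few crossings
consonant_like = sine(3000, sample_rate, 400)  # high frequency, many crossings

vowel_zcr = zero_crossing_rate(vowel_like)
consonant_zcr = zero_crossing_rate(consonant_like)
per_frame = frame_zcr(vowel_like + consonant_like, frame_len=100, hop=100)
```

In this sketch `per_frame` holds one ZCR value per 100-sample frame, so a simple threshold on those values would separate the low-ZCR frames from the high-ZCR ones. Real speech would need a threshold tuned on recorded data.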
Abstract:
This master's thesis describes the development of signal processing and pattern recognition for monitoring Parkinson's disease. It involves developing a signal processing algorithm and passing its output into a pattern recognition algorithm. These algorithms are used to determine, predict and draw conclusions in the study of Parkinson's disease, giving an understanding of the nature of Parkinson's disease in humans.
Abstract:
The motivation for this thesis work is the need to improve the reliability of equipment and the quality of service to railway passengers, as well as a requirement for cost-effective and efficient condition maintenance management for rail transportation. This thesis work develops a fusion of various machine vision analysis methods to achieve high performance in the automation of wooden rail track inspection. Condition monitoring in rail transport is done manually by a human operator, where people rely on inference systems and assumptions to develop conclusions. The use of condition monitoring allows maintenance to be scheduled, or other actions to be taken, to avoid the consequences of failure before the failure occurs. Manual or automated condition monitoring of materials in fields of public transportation such as railways, aerial navigation and traffic safety, where safety is of prime importance, needs non-destructive testing (NDT). In general, wooden railway sleeper inspection is done manually by a human operator who moves along the rail sleepers and gathers information by visual and sound analysis to check for the presence of cracks. Human inspectors working on lines visually inspect wooden rails to judge the quality of the rail sleepers. In this project work, the machine vision system is developed based on the manual visual analysis system and uses digital cameras and image processing software to perform similar inspections. Manual inspection requires much effort, is expected to be error-prone at times, and the frequent changes in the inspected material make discrimination difficult even for a human operator. The machine vision system developed classifies the condition of the material by examining individual pixels of images, processing them and attempting to develop conclusions with the assistance of knowledge bases and features. A pattern recognition approach is developed based on the methodological knowledge from the manual procedure. 
The pattern recognition approach for this thesis work was developed and achieved with a non-destructive testing method to identify the flaws in manually performed condition monitoring of sleepers. In this method, a test vehicle is designed to capture sleeper images similar to visual inspection by a human operator, and the raw data for the pattern recognition approach is provided by the captured images of the wooden sleepers. The data from the NDT method were further processed and appropriate features were extracted. The purpose of collecting data by the NDT method is to achieve high accuracy and reliability in the classification results. A key idea is to use an unsupervised classifier, based on the features extracted by the method, to discriminate the condition of wooden sleepers into either good or bad. A self-organising map is used as the classifier for the wooden sleeper classification. In order to achieve greater integration, the data collected by the machine vision system were combined by a strategy called fusion. Data fusion was examined at two different levels, namely sensor-level fusion and feature-level fusion. As the goal was to reduce human error in classifying rail sleepers as good or bad, the results obtained by feature-level fusion were satisfactory when compared with the actual classification.
Abstract:
Though sound symbolic words (onomatopoeia and mimetic words, or giongo and gitaigo in Japanese) exist in other languages, it is not easy to compare them to those in Japanese. This is because, unlike in Japanese, in many other languages (here we consider English and Spanish) sound symbolic words do not have distinctive forms that separate them immediately from other categories of words. In Japanese, a sound symbolic word has a radical (based on the elaborate Japanese sound symbolic system) and often a suffix that conveys a subtle nuance. Together they give the word a distinctive form that differentiates it from other categories of words, though its grammatical functions can vary, especially in the case of mimetic words (gitaigo). Without such an obvious feature, it is not always easy to separate sound symbolic words from the rest in other languages. These expressions are extremely common and used in almost all types of text in Japanese, but their elaborate sound symbolic system and possibly their various grammatical functions make giongo and gitaigo one of the most difficult challenges for foreign students and translators. Studying the translation of these expressions into other languages might give some indication of how Japanese sound symbolic words compare with those in other languages. Though sound symbolic words are present in many types of texts in Japanese, their functions in traditional forms of text (letters only) and manga (Japanese comics) are different and they should be treated separately. For example, in traditional types of text such as novels, the vast majority of the sound symbolic words used are mimetic words (gitaigo) and most of them are used as adverbs, whereas in manga, the majority of the sound symbolic words used (excluding those that appear within the speech bubbles) are onomatopoeias (giongo) and are often used on their own (i.e. not as part of a sentence). 
Naturally, the techniques used to translate these expressions in the above two types of documents differ greatly. The presentation will focus on i) the grammatical functions of Japanese sound symbolic words in traditional types of texts (novels/poems) and in manga works, and ii) whether their features and functions are maintained (i.e. whether they are translated as sound symbolic words) when translated into other languages (English and Spanish). The latter point relates to a comparison of sound symbolic words in Japanese and other languages, which will also be discussed.
Abstract:
For the past few decades, researchers have increased our understanding of how sound functions within various audio–visual media formats. With a different focus in mind, this study aims to identify the roles and functions of sound in relation to the game form Audio Games, in order to explore the potential of sound when acting as an autonomous narrative form. Because this is still a relatively unexplored research field, the main purpose of this study is to help establish a theoretical ground and stimulate further research within the field of audio games. By adopting an interdisciplinary approach to the topic, this research relies on theoretical studies, examinations of audio games and contact with the audio game community. In order to reveal the roles of sound, the gathered data is analyzed according to both a contextual and a functional perspective. The research shows that a distinction between the terms ‘function’ and ‘role’ is important when analyzing sound in digital games. The analysis therefore results in the identification of two analytical levels that help define the functions and roles of an entity within a social context, named the Functional and the Interfunctional levels. In addition to successfully identifying three main roles of sound within audio games—each describing the relationship between sound and the entities game system, player and virtual environment—many other issues are also addressed. Consequently, and in accordance with its purpose, this study provides a broad foundation for further research of sound in both audio games and video games.
Abstract:
GPS technology is now embedded in portable, low-cost electronic devices to track the movements of mobile objects. This development has greatly impacted the transportation field by creating a novel and rich source of traffic data on the road network. Although GPS devices show significant promise for overcoming problems such as underreporting, respondent fatigue, inaccuracies and other human errors in data collection, the technology is still so new that it raises many issues for potential users. These issues tend to revolve around the following areas: reliability, data processing and the related application. This thesis aims to study GPS tracking from the methodological, technical and practical aspects. It first evaluates the reliability of GPS-based traffic data using data from an experiment containing three different traffic modes (car, bike and bus) traveling along the road network. It then outlines the general procedure for processing GPS tracking data and discusses related issues that are uncovered by using real-world GPS tracking data of 316 cars. Thirdly, it investigates the influence of road network density on finding optimal locations for enhancing travel efficiency and decreasing travel cost. The results show that the geographical positioning is reliable. Velocity is slightly underestimated, whereas altitude measurements are unreliable. Post-processing techniques with auxiliary information are found to be necessary and important for resolving the inaccuracy of GPS data. The density of the road network influences the finding of optimal locations. The influence stabilizes at a certain level and does not deteriorate as the node density increases.
Abstract:
The advancement of GPS technology enables GPS devices not only to be used as orientation and navigation tools, but also to track travelled routes. GPS tracking data provides essential information for a broad range of urban planning applications such as transportation routing and planning, traffic management and environmental control. This paper describes the processing of data collected by tracking the cars of 316 volunteers over a seven-week period. Detailed information is extracted. The processed data is further connected to the underlying road network by means of maps. Geographical maps are applied to check how the car movements match the road network. The maps capture the complexity of the car movements in the urban area. The results show that 90% of the trips on the plane match the road network within a tolerance.
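One way to read "match the road network within a tolerance" is as a distance check in planar coordinates. The Python sketch below is only an illustration under stated assumptions: the abstract does not say how matching was defined, so the rule used here (every point of a trip lies within the tolerance of some road segment), the function names and the toy coordinates are all hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (planar coordinates)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the line, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def trip_matches_network(trip, segments, tolerance):
    """Assumed rule: a trip matches if every point is near some segment."""
    return all(
        min(point_segment_distance(p, a, b) for a, b in segments) <= tolerance
        for p in trip
    )


def matched_share(trips, segments, tolerance):
    """Share of trips that match the network under the assumed rule."""
    if not trips:
        return 0.0
    matched = sum(trip_matches_network(t, segments, tolerance) for t in trips)
    return matched / len(trips)


# Toy data: one straight road and two short trips (hypothetical values).
roads = [((0.0, 0.0), (10.0, 0.0))]
trips = [
    [(1.0, 0.5), (5.0, -0.5)],  # stays within 1 unit of the road
    [(5.0, 8.0)],               # far from the road
]
share = matched_share(trips, roads, tolerance=1.0)
```

With real GPS data the coordinates would first need to be projected to a planar system, and a spatial index would replace the brute-force scan over all segments.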
Abstract:
This paper summarises the results of using image processing techniques to obtain information about the load of timber trucks before their arrival, using digital images or geotagged images. Once the images are captured and sent to the sawmill by drivers in the forest, we can predict their arrival time using the geotagged coordinates, count the number of timber logs piled up in a truck, identify their type and calculate their diameter. With this information we can schedule and prioritise the inflow and unloading of trucks in the light of production schedules and the raw material stocks available at the sawmill yard. It is important to keep all the actors in a supply chain integrated and coordinated, so that optimal working routines can be reached in the sawmill yard.
Abstract:
Speech perception runs smoothly and automatically when there is silence in the background, but when the speech signal is degraded by background noise or by reverberation, effortful cognitive processing is needed to compensate for the signal distortion. Previous research has typically investigated the effects of signal-to-noise ratio (SNR) and reverberation time in isolation, whilst few have looked at their interaction. In this study, we probed how reverberation time and SNR influence recall of words presented in participants' first- (L1) and second-language (L2). A total of 72 children (10 years old) participated in this study. The to-be-recalled wordlists were played back with two different reverberation times (0.3 and 1.2 s) crossed with two different SNRs (+3 dBA and +12 dBA). Children recalled fewer words when the spoken words were presented in L2 in comparison with recall of spoken words presented in L1. Words that were presented with a high SNR (+12 dBA) improved recall compared to a low SNR (+3 dBA). Reverberation time interacted with SNR to the effect that at +12 dB the shorter reverberation time improved recall, but at +3 dB it impaired recall. The effects of the physical sound variables (SNR and reverberation time) did not interact with language. © 2016 Hurtig, Keus van de Poll, Pekkola, Hygge, Ljung and Sörqvist.
Abstract:
The demands on image-processing-related systems are robustness, high recognition rates, the capability to handle incomplete digital information, and great flexibility in capturing the shape of an object in an image. It is exactly here that convex hulls come into play. The objective of this paper is twofold. First, we summarize the state of the art in computational convex hull development for researchers interested in using convex hull image processing to build their intuition or generate nontrivial models. Secondly, we present several applications involving convex hulls in image-processing-related tasks. By this, we have striven to show researchers the rich and varied set of applications they can contribute to. This paper also makes a humble effort to enthuse prospective researchers in this area. We hope that the resulting awareness will lead to new advances for specific image recognition applications.
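As a concrete illustration of what a convex hull computation looks like, here is a minimal Python sketch of Andrew's monotone chain algorithm applied to a handful of pixel coordinates. This is one standard textbook method and is not claimed to be among those the paper surveys. The sample points are made up for the example.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull via Andrew's monotone chain, in counter-clockwise order.

    Collinear points on the hull boundary are dropped.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    # Build the lower hull from left to right.
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    # Build the upper hull from right to left.
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


# Hypothetical foreground pixels of an object: a square outline plus
# one interior point and one point on an edge.
pixels = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
hull = convex_hull(pixels)
```

Here `hull` keeps only the four corner pixels, which is the kind of compact shape description the abstract refers to. The sort dominates the cost, so the method runs in O(n log n) time.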