Digital Library

15 results for Sound recognition

in Dalarna University College Electronic Archive

Road and Traffic Signs Recognition using Vector Machines

Abstract:

An Intelligent Transportation System (ITS) is a system that builds a safe, effective and integrated transportation environment based on advanced technologies. Road sign detection and recognition is an important part of ITS, offering ways to collect real-time traffic data for processing at a central facility. This project implements a road sign recognition model based on AI and image analysis technologies, applying a machine learning method, Support Vector Machines (SVM), to recognize road signs. We focus on recognizing seven categories of road sign shapes and five categories of speed limit signs. Two kinds of features, binary images and Zernike moments, are used to represent the data to the SVM for training and testing. We compared and analyzed the performance of the SVM recognition model using different features and different kernels. Moreover, the performance of two different recognition models, SVM and Fuzzy ARTMAP, is also observed.
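To make the idea of comparing kernels on binary-image features concrete, here is a minimal sketch. The abstract does not say which kernels were compared; the linear and RBF kernels below are assumed only as two common choices, and the 3x3 images, the `gamma` value and all function names are hypothetical.

```python
import math

def flatten(binary_image):
    """Turn a 2D binary image (list of rows) into a flat feature vector."""
    return [pixel for row in binary_image for pixel in row]

def linear_kernel(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel exp(-gamma * ||x - y||^2); gamma is illustrative."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Two toy 3x3 "signs" that differ in a single pixel (assumed data).
sign_a = flatten([[0, 1, 0], [1, 1, 1], [0, 1, 0]])
sign_b = flatten([[0, 1, 0], [1, 0, 1], [0, 1, 0]])

print(linear_kernel(sign_a, sign_b))
print(rbf_kernel(sign_a, sign_b))
```

An SVM would evaluate such a kernel between training vectors instead of working on the raw pixels directly, which is why the choice of kernel can change the recognition performance.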

Road Sign Recognition based on Invariant Features using Support Vector Machine

Abstract:

For the last two decades, researchers have been working on developing systems that can assist drivers in the best way possible and make driving safe. Computer vision has played a crucial part in the design of these systems. With the introduction of vision techniques, various autonomous and robust real-time traffic automation systems have been designed, such as traffic monitoring, traffic-related parameter estimation and intelligent vehicles. Among these, automatic detection and recognition of road signs has become an interesting research topic. The system can inform drivers about signs they did not recognize before passing them. The aim of this research project is to present an Intelligent Road Sign Recognition System based on a state-of-the-art technique, the Support Vector Machine. The project is an extension of the work done at the ITS research platform at Dalarna University [25]. The focus of this research work is on the recognition of road signs under analysis. When classifying an image, its location, size and orientation in the image plane are irrelevant features, and one way to get rid of this ambiguity is to extract features that are invariant under the above-mentioned transformations. These invariant features are then used in a Support Vector Machine for classification. The Support Vector Machine is a supervised learning machine that solves problems in a higher dimension with the help of kernel functions and is best known for classification problems.
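As an illustration of what an invariant feature can look like, the sketch below computes central image moments, which do not change when a shape is shifted within the image. This is an assumed example, not necessarily the feature set used in the project; the function names and the toy 5x5 shapes are hypothetical.

```python
def central_moment(image, p, q):
    """Central moment mu_pq of a binary image given as a list of rows."""
    m00 = sum(sum(row) for row in image)
    # Centroid of the foreground pixels.
    x_bar = sum(x * v for row in image for x, v in enumerate(row)) / m00
    y_bar = sum(y * v for y, row in enumerate(image) for v in row) / m00
    # Moments are taken relative to the centroid, so translation cancels out.
    return sum(((x - x_bar) ** p) * ((y - y_bar) ** q) * v
               for y, row in enumerate(image)
               for x, v in enumerate(row))

def normalized_moment(image, p, q):
    """Central moment divided by mu_00^(1 + (p+q)/2), the usual scale normalization."""
    mu00 = central_moment(image, 0, 0)
    return central_moment(image, p, q) / mu00 ** (1 + (p + q) / 2)

shape = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
# The same shape moved two pixels right and one pixel down.
shifted = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 1],
]

print(central_moment(shape, 2, 0), central_moment(shifted, 2, 0))
```

Both calls print the same value because the moments are measured from the centroid. On a discrete pixel grid the scale normalization is only approximate, and rotation invariance needs further combinations of these moments.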

Real-Time Recognition System for Traffic Signs

Abstract:

The aim of this thesis project is to develop a Traffic Sign Recognition algorithm for real-time use. In a real-time environment, vehicles move at high speed on roads. For the intelligent vehicle system it becomes essential to detect, process and recognize the traffic sign approaching the vehicle with high relative velocity, at the right time, so that the driver is able to act proactively on the instructions given by the traffic sign. The system informs drivers about traffic signs they did not recognize before passing them. With the Traffic Sign Recognition system, the vehicle becomes aware of the traffic environment and reacts according to the situation. The objective of the project is to develop a system which can recognize traffic signs in real time. The three target parameters are the system's response time in real-time video streaming, the traffic sign recognition speed in still images and the recognition accuracy. The system consists of three processes: traffic sign detection, traffic sign recognition and traffic sign tracking. The detection process uses physical properties of traffic signs based on a priori knowledge to detect road signs. It generates the road sign image as the input to the recognition process. The recognition process is implemented using the Pattern Matching algorithm. The system was first tested on stationary images, where it showed on average 97% accuracy with an average processing time of 0.15 seconds for traffic sign recognition. This procedure was then applied to real-time video streaming. Finally, the tracking of traffic signs was developed using Blob tracking, which showed an average recognition accuracy of 95% in real time and improved the system's average response time to 0.04 seconds. This project has been implemented in the C language using the Open Computer Vision Library.

MACHINE VISION FOR AUTOMATIC VISUAL INSPECTION OF WOODEN RAILWAY SLEEPERS USING UNSUPERVISED NEURAL NETWORKS

Abstract:

The motivation for this thesis work is the need to improve the reliability of equipment and the quality of service to railway passengers, as well as the requirement for cost-effective and efficient condition maintenance management for rail transportation. This thesis work develops a fusion of various machine vision analysis methods to achieve high performance in the automation of wooden rail track inspection. Condition monitoring in rail transport is done manually by a human operator, where people rely on inference systems and assumptions to develop conclusions. The use of condition monitoring allows maintenance to be scheduled, or other actions to be taken to avoid the consequences of failure, before the failure occurs. Manual or automated condition monitoring of materials in fields of public transportation such as railways, aerial navigation and traffic safety, where safety is of prime importance, needs non-destructive testing (NDT). In general, wooden railway sleeper inspection is done manually by a human operator, who moves along the rail sleepers and gathers information by visual and sound analysis to examine the presence of cracks. Human inspectors working on lines visually inspect wooden rails to judge the quality of the rail sleepers. In this project work, the machine vision system is developed based on the manual visual analysis system and uses digital cameras and image processing software to perform similar inspections. Manual inspection requires much effort, is sometimes error-prone, and discrimination can be difficult even for a human operator because of frequent changes in the inspected material. The machine vision system developed classifies the condition of the material by examining individual pixels of images, processing them and attempting to develop conclusions with the assistance of knowledge bases and features. A pattern recognition approach is developed based on the methodological knowledge from the manual procedure.
The pattern recognition approach for this thesis work was developed and achieved by a non-destructive testing method to identify the flaws in manually done condition monitoring of sleepers. In this method, a test vehicle is designed to capture sleeper images similar to visual inspection by a human operator, and the raw data for the pattern recognition approach is provided by the captured images of the wooden sleepers. The data from the NDT method were further processed and appropriate features were extracted. The purpose of collecting data by the NDT method is to achieve high accuracy and reliable classification results. A key idea is to use an unsupervised classifier, based on the features extracted by the method, to discriminate the condition of wooden sleepers into either good or bad. A self-organising map is used as the classifier for the wooden sleeper classification. In order to achieve greater integration, the data collected by the machine vision system were made to interface with one another by a strategy called fusion. Data fusion was examined at two different levels, namely sensor-level fusion and feature-level fusion. As the goal was to reduce human error in classifying rail sleepers as good or bad, the results obtained by feature-level fusion, compared with the actual classification, were satisfactory.

An Upgrade of Network Traffic Recognition System for SIP/VoIP Traffic Recognition

Abstract:

The purpose of this project is to update the Network Traffic Recognition System (NTRS) tool, which is proprietary software of Ericsson AB and Tsinghua University, and to use the updated tool to perform SIP/VoIP traffic recognition. Based on the original NTRS, I analyze the traffic recognition principle of NTRS, redesign the structure and modules of the tool according to the characteristics of SIP/VoIP traffic, and finally program the upgrade. After the final test of the updated system with our SIP data trace files, a satisfactory result is obtained. The results show that our updated system achieves a recognition rate at a confident level in both SIP session recognition and VoIP call recognition. In comparison with Wireshark, our updated system produces a result that is extremely close to Wireshark's output, and its running time is much shorter than Wireshark's. In terms of practicability, the memory overflow problem is avoided, and the updated system can output specific information about the recognized SIP/VoIP traffic, such as SIP type, SIP state and VoIP state. The upgrade fulfills the demands of this project.
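The kind of "SIP type" output mentioned above can be illustrated with a small sketch that classifies the first line of a SIP message as a request or a response. This is not the NTRS implementation, which is proprietary and not described in the abstract; the function name and return format are hypothetical, and only the six methods defined in RFC 3261 are recognized.

```python
# The six request methods defined by the core SIP specification (RFC 3261).
SIP_METHODS = {"INVITE", "ACK", "BYE", "CANCEL", "OPTIONS", "REGISTER"}

def classify_sip_start_line(line):
    """Return ("request", method), ("response", status_code) or ("not-sip", None)."""
    parts = line.strip().split(" ", 2)
    # Status line: SIP/2.0 <three-digit code> <reason phrase>
    if len(parts) >= 2 and parts[0] == "SIP/2.0" and parts[1].isdigit():
        return ("response", int(parts[1]))
    # Request line: <METHOD> <request URI> SIP/2.0
    if len(parts) == 3 and parts[0] in SIP_METHODS and parts[2] == "SIP/2.0":
        return ("request", parts[0])
    return ("not-sip", None)

print(classify_sip_start_line("INVITE sip:bob@example.com SIP/2.0"))
print(classify_sip_start_line("SIP/2.0 180 Ringing"))
print(classify_sip_start_line("GET /index.html HTTP/1.1"))
```

A real recognizer would also have to track dialog state across messages and follow the media streams negotiated in the session, which this sketch leaves out.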

Hand Gesture Detection & Recognition System

Abstract:

The project introduces an application that uses computer vision for hand gesture recognition. A camera records a live video stream, from which a snapshot is taken with the help of the interface. The system is trained for each type of counting hand gesture (one, two, three, four, and five) at least once. After that, a test gesture is given to it and the system tries to recognize it. Research was carried out on a number of algorithms that could best differentiate hand gestures. It was found that the diagonal sum algorithm gave the highest accuracy rate. In the preprocessing phase, a self-developed algorithm removes the background of each training gesture. After that, the image is converted into a binary image and the sums of all diagonal elements of the picture are taken. These sums help in differentiating and classifying different hand gestures. Previous systems have used data gloves or markers for input to the system. This system has no such constraints: the user can make hand gestures naturally in view of the camera. A completely robust hand gesture recognition system is still under heavy research and development; the implemented system serves as an extendible foundation for future work.
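The diagonal-sum step described above can be sketched as follows. The abstract does not give the exact formulation, so this version assumes one sum per diagonal of the binary image and a nearest-template comparison; the function names and the tiny 3x3 "gestures" are hypothetical.

```python
def diagonal_sums(binary_image):
    """Sum the pixels along every top-left to bottom-right diagonal."""
    rows, cols = len(binary_image), len(binary_image[0])
    sums = [0] * (rows + cols - 1)
    for r in range(rows):
        for c in range(cols):
            # Pixels on the same diagonal share the same value of (c - r).
            sums[c - r + rows - 1] += binary_image[r][c]
    return sums

def classify(test_image, templates):
    """Return the label whose diagonal-sum signature is closest to the test image."""
    test_signature = diagonal_sums(test_image)

    def distance(label):
        template_signature = diagonal_sums(templates[label])
        return sum(abs(a - b) for a, b in zip(test_signature, template_signature))

    return min(templates, key=distance)

# Toy training templates standing in for background-free binary gesture images.
templates = {
    "one": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "two": [[1, 0, 1], [1, 0, 1], [1, 0, 1]],
}
test_gesture = [[0, 1, 0], [0, 1, 0], [0, 0, 0]]

print(diagonal_sums(test_gesture))
print(classify(test_gesture, templates))
```

The signature is much shorter than the image itself, which keeps the comparison cheap, but it assumes the hand appears at a similar position and scale in the training and test snapshots.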

The functions of Japanese Sound Symbolic Words in Different Types of Texts and Their Translation

Abstract:

Though sound symbolic words (onomatopoeia and mimetic words, or giongo and gitaigo in Japanese) exist in other languages, it is not easy to compare them to those in Japanese. This is because, unlike in Japanese, in many other languages (here we consider English and Spanish) sound symbolic words do not have distinctive forms that separate them immediately from other categories of words. In Japanese, a sound symbolic word has a radical (based on the elaborate Japanese sound symbolic system), and often a suffix that shows subtle nuance. Together they give the word a distinctive form that differentiates it from other categories of words, though its grammatical functions can vary, especially in the case of mimetic words (gitaigo). Without such an obvious feature, it is not always easy in other languages to separate sound symbolic words from the rest. These expressions are extremely common and are used in almost all types of text in Japanese, but their elaborate sound symbolic system and possibly their various grammatical functions make giongo and gitaigo one of the most difficult challenges for foreign students and translators. Studying the translation of these expressions into other languages might give some indication of how Japanese sound symbolic words compare with those in other languages. Though sound symbolic words are present in many types of texts in Japanese, their functions in traditional forms of text (letters only) and manga (Japanese comics) are different and they should be treated separately. For example, in traditional types of text such as novels, the vast majority of the sound symbolic words used are mimetic words (gitaigo) and most of them are used as adverbs, whereas in manga, the majority of the sound symbolic words used (excluding those that appear within the speech bubbles) are onomatopoeias (giongo) and are often used on their own (i.e. not as part of a sentence).
Naturally, the techniques used to translate these expressions in the above two types of documents differ greatly. The presentation will focus on i) the grammatical functions of Japanese sound symbolic words in traditional types of texts (novels/poems) and in manga works, and ii) whether their features and functions are maintained (i.e. whether they are translated as sound symbolic words) when translated into other languages (English and Spanish). The latter point relates to a comparison of sound symbolic words in Japanese and other languages, which will also be discussed.

Game Audio in Audio Games : Towards a Theory on the Roles and Functions of Sound in Audio Games

Abstract:

For the past few decades, researchers have increased our understanding of how sound functions within various audio–visual media formats. With a different focus in mind, this study aims to identify the roles and functions of sound in relation to the game form Audio Games, in order to explore the potential of sound when acting as an autonomous narrative form. Because this is still a relatively unexplored research field, the main purpose of this study is to help establish a theoretical ground and stimulate further research within the field of audio games. By adopting an interdisciplinary approach to the topic, this research relies on theoretical studies, examinations of audio games and contact with the audio game community. In order to reveal the roles of sound, the gathered data is analyzed according to both a contextual and a functional perspective. The research shows that a distinction between the terms ‘function’ and ‘role’ is important when analyzing sound in digital games. The analysis therefore results in the identification of two analytical levels that help define the functions and roles of an entity within a social context, named the Functional and the Interfunctional levels. In addition to successfully identifying three main roles of sound within audio games—each describing the relationship between sound and the entities game system, player and virtual environment—many other issues are also addressed. Consequently, and in accordance with its purpose, this study provides a broad foundation for further research of sound in both audio games and video games.

Assessment of PD Speech Anomalies @ Home

Abstract:

Background: Voice processing in real time is challenging. A drawback of previous work on Hypokinetic Dysarthria (HKD) recognition is the requirement of controlled settings in a laboratory environment. A personal digital assistant (PDA) has been developed for home assessment of PD patients. The PDA offers sound processing capabilities, which allow for developing a module for recognition and quantification of HKD. Objective: To compose an algorithm for assessment of PD speech severity in the home environment based on a review synthesis. Methods: A two-tier review methodology is utilized. The first tier focuses on real-time problems in speech detection. In the second tier, acoustic features that are robust to medication changes in Levodopa-responsive patients are investigated for HKD recognition. Keywords such as "Hypokinetic Dysarthria" and "Speech recognition in real time" were used in the search engines. IEEE Xplore produced the most useful search hits compared to Google Scholar, ELIN, EBRARY, PubMed and LIBRIS. Results: Vowel and consonant formants are the most relevant acoustic parameters to reflect PD medication changes. Since relevant speech segments (consonants and vowels) contain a minority of the speech energy, intelligibility can be improved by amplifying the voice signal using amplitude compression. Pause detection and peak-to-average power rate calculations for voice segmentation produce rich voice features in real time. Voice segmentation can be enhanced by introducing the zero-crossing rate (ZCR). Consonants have high ZCR whereas vowels have low ZCR. The wavelet transform is found promising for voice analysis since it quantizes non-stationary voice signals over time series using scale and translation parameters. In this way voice intelligibility in the waveforms can be analyzed in each time frame. Conclusions: This review evaluated HKD recognition algorithms to develop a tool for PD speech home assessment using modern mobile technology.
An algorithm that tackles real-time constraints in HKD recognition based on the review synthesis is proposed. We suggest that speech features may be further processed using wavelet transforms and used with a neural network for detection and quantification of speech anomalies related to PD. Based on this model, patients' speech can be automatically categorized according to UPDRS speech ratings.
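Two of the frame-level measures named in the abstract, the zero-crossing rate and the peak-to-average power ratio, can be sketched in a few lines. This is a minimal illustration under assumed definitions; the sample rate, frame length and the synthetic sine frames that stand in for vowel-like and consonant-like audio are all hypothetical.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def peak_to_average_power(frame):
    """Ratio of the largest instantaneous power to the mean power of the frame."""
    powers = [sample * sample for sample in frame]
    return max(powers) / (sum(powers) / len(powers))

def sine_frame(freq_hz, sample_rate=8000, length=200, phase=0.1):
    """Synthetic test frame; the small phase offset avoids samples exactly at zero."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate + phase)
            for n in range(length)]

vowel_like = sine_frame(100)       # low-frequency, periodic energy
consonant_like = sine_frame(3000)  # high-frequency energy

print(zero_crossing_rate(vowel_like))
print(zero_crossing_rate(consonant_like))
print(peak_to_average_power(vowel_like))
```

The low-frequency frame crosses zero only a handful of times while the high-frequency frame crosses on most samples, which is the contrast between vowels and consonants that the abstract relies on for segmentation.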

Motion Cues Analysis for Parkinson Gait Recognition

Abstract:

Background: Previous assessment methods for Parkinson gait (PG) recognition used sensor mechanisms that may cause discomfort. In order to avoid the stress of applying wearable sensors, computer vision (CV) based diagnostic systems for PG recognition have been proposed. The main constraints in these methods are the laboratory setup procedures: novel colored dresses were specifically designed for the patients in order to segment the test body from a specific colored background. Objective: To develop an image processing tool for home assessment of PG by analyzing motion cues extracted during the gait cycles. Methods: The system is based on the idea that a normal body attains equilibrium during the gait by aligning the body posture with the axis of gravity. Due to the rigidity in muscular tone, persons with PD fail to align their bodies with the axis of gravity. The leaning posture of PD patients makes them appear to fall forward, whereas a normal posture remains constantly erect throughout the gait. Patients with PD walk with a shortened stride angle (less than 15 degrees on average) and high variability in the stride frequency, whereas a normal gait exhibits a constant stride frequency with an average stride angle of 45 degrees. In order to analyze PG, levodopa-responsive patients and normal controls were videotaped over several gait cycles. First, the test body is segmented in each frame of the gait video, based on the pixel contrast from the background, to form a silhouette. Next, the center of gravity of this silhouette is calculated. The silhouette is further skeletonized from the video frames to extract the motion cues. The two motion cues were the stride frequency, based on the cyclic leg motion, and the lean frequency, based on the angle between the leaned torso tangent and the axis of gravity. The differences in the peaks in stride and lean frequencies between PG and normal gait are calculated using cosine similarity measurements.
Results: High cosine dissimilarity was observed in the stride and lean frequencies between PG and normal gait. High variations are found in the stride intervals of PG, whereas constant stride intervals are found in the normal gait. Conclusions: We propose an algorithm as a means to eliminate laboratory constraints and discomfort during PG analysis. Installing this tool on a home computer with a webcam allows assessment of gait in the home environment.
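The cosine similarity comparison mentioned above can be sketched as follows. The stride-interval sequences are invented numbers chosen only to show the contrast between a regular and an irregular pattern; they are not data from the study, and the variable names are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Assumed stride intervals in seconds over four consecutive strides.
reference_gait = [1.0, 1.0, 1.0, 1.0]   # idealized constant stride
regular_gait = [1.0, 1.1, 0.9, 1.0]     # small natural variation
irregular_gait = [0.4, 1.6, 0.3, 1.5]   # high stride-to-stride variability

print(cosine_similarity(reference_gait, regular_gait))
print(cosine_similarity(reference_gait, irregular_gait))
```

The regular sequence scores close to 1 against the reference while the highly variable one scores noticeably lower, so one minus the similarity can serve as the dissimilarity the abstract refers to.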

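Dialekter och röstigenkänning : Ett röstigenkännings-API:s förmåga att uppfatta svenska dialekters kännetecken och röstkombinationer [Dialects and speech recognition: a speech recognition API's ability to perceive the characteristics and voice combinations of Swedish dialects]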

Abstract:

As development progresses in applications and systems, the way we interact with systems also changes. Until now, navigation and use of applications and systems has mostly been done with the hands, via mouse and keyboard. More recently, navigation via touch screens and voice has become increasingly common. When an application is to be controlled by voice, it is important that anyone can control the application, regardless of dialect. To see how accurately a speech recognition API (Application Programming Interface) perceives Swedish dialects, this study was initiated with document studies of the characteristics and sound combinations of dialects. These characteristics and sound combinations formed the basis for the words we selected to test the API with. Each dialect was thus given a word constructed to be especially difficult for the API to recognize when pronounced in that particular dialect. A prototype was then developed, more specifically an Android application that served as a tool in the data collection. Since the work includes a prototype and an investigation, Design and Creation Research was chosen as the research strategy, with document studies and observations as data collection methods, to obtain the desired result. Data were collected through observations, with the prototype as an aid, and through document studies. The empirical data recorded via the observations and with the help of the application showed that certain dialects were easier for the API to recognize correctly. In some cases the results were expected, since certain words built from sound combinations would, according to the theory, be pronounced very distinctively in a certain dialect. Sometimes the results for precisely these words were very low, but in other cases surprisingly high. The conclusion we drew from this was that the words we had selected with the expectation that they would give low results for the particular dialect turned out to do so on only two occasions.
Instead, it was the word containing sje- and tje-sounds, which according to the theory are characteristics common to all dialects, that received the lowest results overall.

What affects recognition most – wrong word stress or wrong word accent?

Abstract:

In an attempt to find out which of the two Swedish prosodic contrasts of 1) word stress pattern and 2) tonal word accent category has the greatest communicative weight, a lexical decision experiment was conducted: in one part, the word stress pattern was changed from trochaic to iambic, and in the other part, trochaic accent II words were changed to accent I. Native Swedish listeners were asked to decide whether the distorted words were real words or 'non-words'. A clear tendency is that listeners preferred to give more 'non-word' responses when the stress pattern was shifted, compared to when the word accent category was shifted. This could have implications for the priority of phonological features when teaching Swedish as a second language.

Traffic and Road Sign Recognition

Abstract:

This thesis presents a system to recognise and classify road and traffic signs for the purpose of developing an inventory of them which could assist highway engineers in the tasks of updating and maintaining them. It uses images taken by a camera from a moving vehicle. The system is based on three major stages: colour segmentation, recognition, and classification. Four colour segmentation algorithms are developed and tested. They are a shadow and highlight invariant algorithm, a dynamic threshold algorithm, a modification of de la Escalera's algorithm and a fuzzy colour segmentation algorithm. All algorithms are tested using hundreds of images, and the shadow-highlight invariant algorithm is eventually chosen as the best performer. This is because it is immune to shadows and highlights. It is also robust, as it was tested in different lighting conditions, weather conditions, and times of the day. A successful segmentation rate of approximately 97% was achieved using this algorithm. Recognition of traffic signs is carried out using a fuzzy shape recogniser. Based on four shape measures (rectangularity, triangularity, ellipticity, and octagonality), fuzzy rules were developed to determine the shape of the sign. Among these shape measures, octagonality has been introduced in this research. The final decision of the recogniser is based on the combination of both the colour and the shape of the sign. The recogniser was tested in a variety of testing conditions, giving an overall performance of approximately 88%. Classification was undertaken using a Support Vector Machine (SVM) classifier. The classification is carried out in two stages: classification of the rim's shape, followed by classification of the interior of the sign. The classifier was trained and tested using binary images in addition to five different types of moments: Geometric moments, Zernike moments, Legendre moments, Orthogonal Fourier-Mellin moments, and Binary Haar features.
The performance of the SVM was tested using different features, kernels, SVM types, SVM parameters, and moment orders. The average classification rate achieved is about 97%. Binary images show the best testing results, followed by Legendre moments. The linear kernel gives the best testing results, followed by RBF. C-SVM shows very good performance, but ν-SVM gives better results in some cases.
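One of the shape measures named above, rectangularity, can be illustrated with a simplified sketch. The thesis's exact definition is not given in the abstract; the version below assumes the common ratio of shape area to bounding-box area and, as a further simplification, uses an axis-aligned box rather than a minimum-area rotated one. The function name and the toy masks are hypothetical.

```python
def rectangularity(mask):
    """Foreground area divided by the area of its axis-aligned bounding box."""
    pixels = [(r, c)
              for r, row in enumerate(mask)
              for c, value in enumerate(row) if value]
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    box_area = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    return len(pixels) / box_area

square_sign = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]
triangular_sign = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]

print(rectangularity(square_sign))
print(rectangularity(triangular_sign))
```

A value near 1 suggests a rectangular sign, while triangles and circles fill less of the box. Fuzzy rules of the kind the abstract describes would combine several such measures rather than thresholding a single one.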

Swedish translation and psychometric testing of the safety attitudes questionnaire (operating room version)

Abstract:

Background: Tens of millions of patients worldwide suffer from avoidable disabling injuries and death every year. Measuring the safety climate in health care is an important step in improving patient safety. The most commonly used instrument to measure safety climate is the Safety Attitudes Questionnaire (SAQ). The aim of the present study was to establish the validity and reliability of the translated version of the SAQ. Methods: The SAQ was translated and adapted to the Swedish context. The survey was then carried out with 374 respondents in the operating room (OR) setting. Data were received from three hospitals, a total of 237 responses. Cronbach's alpha and confirmatory factor analysis (CFA) were used to evaluate the reliability and validity of the instrument. Results: The Cronbach's alpha values for each of the factors of the SAQ ranged between 0.59 and 0.83. The CFA and its goodness-of-fit indices (SRMR 0.055, RMSEA 0.043, CFI 0.98) showed good model fit. The factors safety climate, teamwork climate, job satisfaction, perceptions of management, and working conditions showed moderate to high correlation with each other. The factor stress recognition had no significant correlation with teamwork climate, perceptions of management, or job satisfaction. Conclusions: The Swedish translation of the SAQ (OR version) has good construct validity. However, the reliability analysis suggested that some of the items need further refinement to establish sound internal consistency. As suggested by previous research, the SAQ is potentially a useful tool for evaluating safety climate. However, further psychometric testing is required with larger samples to establish the psychometric properties of the instrument for use in Sweden.
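For readers unfamiliar with the reliability statistic used above, Cronbach's alpha for a factor with k items is k/(k-1) times one minus the ratio of the summed item variances to the variance of the total score. A minimal sketch follows; the response matrices are invented for illustration and are not SAQ data, and the function name is hypothetical.

```python
from statistics import variance

def cronbach_alpha(responses):
    """responses: one list of item scores per respondent, all of equal length."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([resp[i] for resp in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(resp) for resp in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Three items answered identically by each respondent: perfectly consistent.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
# Two items that agree only partly across four respondents.
noisy = [[2, 3], [4, 3], [3, 5], [5, 5]]

print(cronbach_alpha(consistent))
print(cronbach_alpha(noisy))
```

The first data set yields an alpha of about 1 and the second a clearly lower value, which is the sense in which the reported range of 0.59 to 0.83 indicates that some factors are measured more consistently than others.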

Motion cue analysis for parkinsonian gait recognition

Abstract:

This paper presents a computer-vision based marker-free method for gait-impairment detection in patients with Parkinson's disease (PWP). The system is based upon the idea that a normal human body attains equilibrium during the gait by aligning the body posture with the Axis-of-Gravity (AOG), using the feet as the base of support. In contrast, PWP appear to be falling forward, as they are less able to align their body with the AOG due to rigid muscular tone. A normal gait exhibits periodic stride-cycles with a stride-angle of around 45° between the legs, whereas PWP walk with a shortened stride-angle and high variability between the stride-cycles. In order to analyze Parkinsonian gait (PG), subjects were videotaped over several gait-cycles. The subject's body was segmented using a color-segmentation method to form a silhouette. The silhouette was skeletonized for motion cue extraction. The motion cues analyzed were stride-cycles (based on the cyclic leg motion of the skeleton) and posture lean (based on the angle between the leaned torso of the skeleton and the AOG). Cosine similarity between an imaginary perfect gait pattern and the subject gait patterns produced a 100% recognition rate of PG for 4 normal controls and 3 PWP. The results suggest that the method is a promising tool for PG assessment in the home environment.
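The posture-lean cue described above amounts to measuring the angle between the torso line of the skeleton and the vertical axis. The sketch below assumes the torso is represented by two hypothetical skeleton points, a hip and a shoulder, given in image coordinates where y grows downward; the paper's own skeleton representation is not specified in the abstract.

```python
import math

def lean_angle_degrees(hip, shoulder):
    """Angle between the hip-to-shoulder line and the vertical, in degrees.

    Points are (x, y) pixel coordinates with y increasing downward.
    """
    dx = shoulder[0] - hip[0]
    dy = hip[1] - shoulder[1]  # flip the sign so that "up" is positive
    return math.degrees(math.atan2(abs(dx), dy))

# Assumed example coordinates, not measurements from the study.
upright = lean_angle_degrees(hip=(100, 200), shoulder=(100, 100))
leaning = lean_angle_degrees(hip=(100, 200), shoulder=(130, 105))

print(upright)
print(leaning)
```

An erect posture gives an angle near zero, and a forward lean gives a larger one. Tracking this angle frame by frame would produce the lean pattern that is then compared with the reference gait.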