967 results for Speaker Recognition, Text-constrained, Multilingual, Speaker Verification, HMMs
Abstract:
For the last two decades, researchers have been working on developing systems that can assist drivers in the best way possible and make driving safe. Computer vision has played a crucial part in the design of these systems. With the introduction of vision techniques, various autonomous and robust real-time traffic automation systems have been designed, such as traffic monitoring, traffic-related parameter estimation and intelligent vehicles. Among these, automatic detection and recognition of road signs has become an interesting research topic. Such a system can inform drivers about signs they did not recognize before passing them. The aim of this research project is to present an Intelligent Road Sign Recognition System based on a state-of-the-art technique, the Support Vector Machine. The project is an extension of the work done at the ITS research platform at Dalarna University [25]. The focus of this research work is on the recognition of road signs under analysis. When classifying an image, its location, size and orientation in the image plane are irrelevant features, and one way to get rid of this ambiguity is to extract features which are invariant under the above-mentioned transformations. These invariant features are then used in a Support Vector Machine for classification. The Support Vector Machine is a supervised learning machine that solves problems in a higher dimension with the help of kernel functions and is best known for classification problems.
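As an illustration of features that do not change with location or orientation, the sketch below computes the first Hu moment invariant of a small image. The abstract does not say which invariant features the project uses, so the choice of this moment and the function name `hu_phi1` are assumptions made only for the example. On a pixel grid the value is exactly unchanged by translation and by 90-degree rotation, and only approximately unchanged by scaling.

```python
def hu_phi1(image):
    """First Hu moment invariant of a 2-D grid of non-negative pixel values.

    Hypothetical helper for illustration; not taken from the project itself.
    """
    # Zeroth raw moment: total mass of the shape.
    m00 = sum(v for row in image for v in row)
    if m00 == 0:
        raise ValueError("image has no foreground pixels")
    # Centroid from the first raw moments.
    cx = sum(x * v for row in image for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(image) for v in row) / m00
    # Second-order central moments: measuring from the centroid removes
    # the dependence on where the shape sits in the image.
    mu20 = sum((x - cx) ** 2 * v for row in image for x, v in enumerate(row))
    mu02 = sum((y - cy) ** 2 * v for y, row in enumerate(image) for v in row)
    # Normalised moments eta_pq = mu_pq / m00 ** (1 + (p + q) / 2); for
    # p + q = 2 the exponent is 2. Their sum eta20 + eta02 is Hu's phi_1.
    return (mu20 + mu02) / m00 ** 2
```

A value like this, usually together with further invariants, could form the feature vector passed to a classifier such as a Support Vector Machine.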
Abstract:
The aim of this thesis project is to develop a Traffic Sign Recognition algorithm that works in real time. In a real-time environment, vehicles move at high speed on roads. For the intelligent vehicle system it becomes essential to detect, process and recognize a traffic sign that is approaching the vehicle at high relative velocity, and to do so at the right time, so that the driver is able to act proactively on the instructions given by the traffic sign. The system informs drivers about traffic signs they did not recognize before passing them. With the Traffic Sign Recognition system, the vehicle becomes aware of the traffic environment and reacts according to the situation. The objective of the project is to develop a system which can recognize traffic signs in real time. The three target parameters are the system's response time in real-time video streaming, the traffic sign recognition speed in still images and the recognition accuracy. The system consists of three processes: traffic sign detection, traffic sign recognition and traffic sign tracking. The detection process uses physical properties of traffic signs, based on a priori knowledge, to detect road signs. It generates the road sign image that is the input to the recognition process. The recognition process is implemented using the Pattern Matching algorithm. The system was first tested on stationary images, where it showed on average 97% accuracy with an average processing time of 0.15 seconds for traffic sign recognition. This procedure was then applied to real-time video streaming. Finally, the tracking of traffic signs was developed using Blob tracking, which achieved an average recognition accuracy of 95% in real time and improved the system's average response time to 0.04 seconds. This project has been implemented in C using the Open Computer Vision Library.
Abstract:
The purpose of this project is to update the Network Traffic Recognition System (NTRS) tool, which is proprietary software of Ericsson AB and Tsinghua University, and to use the updated tool to perform SIP/VoIP traffic recognition. Starting from the original NTRS, I analyze its traffic recognition principle, redesign the structure and modules of the tool according to the characteristics of SIP/VoIP traffic, and then implement the upgrade. A final test of the updated system with our SIP data trace files gave a satisfactory result. The result shows that the updated system achieves a reliable recognition rate in both SIP session recognition and VoIP call recognition. Compared with Wireshark, the updated system produces a result that is extremely close to Wireshark's output, and its running time is much shorter. In terms of practicality, the memory overflow problem is avoided, and the updated system can output specific information about the recognized SIP/VoIP traffic, such as SIP type, SIP state and VoIP state. The upgrade fulfills the requirements of this project.
Abstract:
The project introduces an application that uses computer vision for hand gesture recognition. A camera records a live video stream, from which a snapshot is taken with the help of an interface. The system is trained at least once for each type of counting hand gesture (one, two, three, four and five). After that, a test gesture is given to it and the system tries to recognize it. Research was carried out on a number of algorithms that could best differentiate hand gestures. It was found that the diagonal sum algorithm gave the highest accuracy rate. In the preprocessing phase, a self-developed algorithm removes the background of each training gesture. After that, the image is converted into a binary image and the sums of all diagonal elements of the picture are taken. This sum helps in differentiating and classifying different hand gestures. Previous systems have used data gloves or markers for input. This system has no such constraints: the user can make hand gestures naturally in view of the camera. A completely robust hand gesture recognition system is still under heavy research and development; the implemented system serves as an extendible foundation for future work.
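The diagonal-sum step described above can be sketched in a few lines of Python. The abstract does not state which diagonal direction is used or how the sums of a test gesture are compared with those of the training gestures. The sketch therefore assumes top-left to bottom-right diagonals, images of equal size, and a nearest-match rule based on the sum of absolute differences. The names `diagonal_sums` and `classify` are hypothetical.

```python
def diagonal_sums(binary):
    """Sum the pixel values along every top-left to bottom-right diagonal."""
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    if rows == 0 or cols == 0:
        return []
    sums = [0] * (rows + cols - 1)
    for r, row in enumerate(binary):
        for c, value in enumerate(row):
            # Cells on one diagonal share the same c - r; shift it to a
            # non-negative list index.
            sums[c - r + rows - 1] += value
    return sums


def classify(test_image, templates):
    """Return the label whose stored diagonal sums are closest to the test image.

    `templates` maps a gesture label to the diagonal sums of its training
    image. The distance used here (sum of absolute differences) is an
    assumption, not the thesis's documented rule.
    """
    test = diagonal_sums(test_image)

    def distance(label):
        return sum(abs(a - b) for a, b in zip(templates[label], test))

    return min(templates, key=distance)
```

In this sketch each training gesture is reduced once to its list of diagonal sums, so classifying a new snapshot only requires one pass over its pixels and a comparison against five short lists.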
Abstract:
Pauses and hesitations in spontaneous speech have been shown to occur systematically, for example, "between sentences, after discourse markers and conjunctions and before accented content words" (Hansson [15]). This is certainly plausible in English, where pauses and hesitations can and often do occur before content words such as nominals, for example, "uh, there's a … man" (Chafe [8]). However, if hesitations are, in fact, evidence of "deciding what to talk about next" (Chafe [8]), then the complex grammatical system of German should render this pausing position precarious, since pre-modifiers must account for the gender of the nominals they modify. In this paper, I present data to test the hypothesis that pre-nominal hesitation patterns in German are dissimilar to those in English. Hesitations in German will be shown, in fact, to occur within noun phrase units. Nevertheless, native speakers most often succeed in supplying a nominal which conforms to the gender indicated by the determiner or pre-modifier. Corrections, or repairs, of infelicitous pre-modifiers indicate that the speaker was unable to supply a nominal of the gender to which the choice of pre-modifier had committed him or her. The frequency of such repairs is shown to vary according to task, with the fewest repairs occurring in elicited speech which allows for linguistic freedom and is therefore most like spontaneous speech. The data sets indicate that among German native speakers, hesitations occurring before noun phrase units (pre-NPU hesitations) indicate deliberation about what to say, while hesitations within or before the head of the noun phrase (pre-NPH hesitations) indicate deliberation about how to say what has already been decided (cf. Chafe [8]).
Abstract:
The features of non-native speech which distinguish it from native speech are often difficult to pin down. It is possible to be a native speaker of any of a vast number of varieties of English. These varieties each have their phonetic characteristics which allow them to be identified by speakers of the varieties in question and by others. The phonetic differences between the accents represented by these varieties are very great. It is impossible to indicate any particular configuration of vowels in the acoustic vowel space or set of consonant articulations which all native-speaker varieties of English have in common and which non-native speakers do not share. This study considers the vowel quality in a single word by native and non-native speakers.
Abstract:
The proposed presentation is a progress report from a project which is aimed at establishing some phonetic correlates of language dominance in various kinds of bilingual situations. The current object of study is Swedish students starting in classes which prepare for the International Baccalaureate (IB) programme. The IB classes in Sweden are taught in English, except for classes in Swedish and foreign languages. This means that after they enter the programme the students are exposed to and speak a good deal more English than previously. The assumption made by many students that, simply by attending an English-medium school, they will on the one hand not "damage" their Swedish and on the other dramatically improve their English will be tested. The linguistic background of the students studied and their reasons for choosing the IB programme will be established. Their English and Swedish proficiency will be tested according to various parameters (native-like syntax, perceived foreign accent, the timing of vowels and consonants in VC sequences, vocabulary mobilisation) on arrival at the school, and again after one and three years at the school. The initial recordings are now underway. In a preliminary study involving just three young people who were bilingual in Swedish and English, the timing of the pronunciation of (C)VC syllables in Swedish and English was studied. The results of this investigation indicate that it may be possible to establish language dominance in bilingual speakers using timing data. It was found that the three subjects differed systematically in their pronunciation of the target words. One subject (15 years old), who was apparently native-like in both languages, had the V-C timing of a native speaker of English in both Swedish and English words. His brother (17 years old), who had a noticeable Swedish accent in English, pronounced both Swedish and English words in this respect like a native speaker of Swedish. The boys' sister (9 years old) apparently had native-like timing in both languages.
Abstract:
In the field of bilingualism it is of particular interest to establish which, if any, of a speaker's languages is dominant. Earlier research has shown that immigrants who acquire a new language tend to use elements of the timing patterns of the new language in their native language. It is shown here that measurements of timing in the two languages spoken by bilingual children can give information about the relative dominance of the languages for the individual speaker.
Abstract:
Modular product architectures have generated numerous benefits for companies in terms of cost, lead-time and quality. The defined interfaces and the modules' properties decrease the effort needed to develop new product variants, and provide an opportunity to perform parallel tasks in design, manufacturing and assembly. The background of this thesis is that companies perform verifications (tests, inspections and controls) of products late, when most of the parts have been assembled. This extends the lead-time to delivery and undermines the benefits of a modular product architecture, especially when the verifications are extensive and the frequency of detected defects is high. Due to the number of product variants obtained from the modular product architecture, verifications must handle a wide range of equipment, instructions and goal values to ensure that high-quality products can be delivered. As a result, the total benefits of a modular product architecture are difficult to achieve. This thesis describes a method for planning and performing verifications within a modular product architecture. The method supports companies by utilizing the defined modules for verifications already at module level, so-called MPV (Module Property Verification). With MPV, defects are detected at an earlier point than with verification of a complete product, and the number of verifications is decreased. The MPV method is built up of three phases. In Phase A, candidate modules are evaluated on the basis of the costs and lead-time of the verifications and of the repair of defects. An MPV-index is obtained which quantifies the module and indicates whether the module should be verified at product level or by MPV. In Phase B, the interface interaction between the modules is evaluated, as well as the distribution of properties among the modules. The purpose is to evaluate the extent to which supplementary verifications at product level are needed. Phase C supports the selection of the final verification strategy. The cost and lead-time for the supplementary verifications are considered together with the results from Phases A and B. The MPV method is based on a set of qualitative and quantitative measures and tools which provide an overview and support the achievement of cost- and time-efficient, company-specific verifications. A practical application in industry shows how the MPV method can be used, and the subsequent benefits.
Abstract:
Product verifications have become a cost-intensive and time-consuming aspect of modern electronics production, and with ever-increasing miniaturisation these aspects will become even more cumbersome. One may also go as far as to point out that certain precision assembly, such as within the biomedical sector, is legally bound to have zero defects within production. Since miniaturisation and precision assembly will soon become a part of almost any product, the verification phases of assembly need to be optimised in both functionality and cost. Another aspect relates to the stability and robustness of processes, a prerequisite for flexibility. Furthermore, as the re-engineering cycle becomes ever more important, all information gathered within the ongoing process becomes vital. In view of these points, product or process verification may be assumed to be an important and integral part of precision assembly. In this paper, product verification is defined as the process of determining whether or not the products, at a given phase in the life-cycle, fulfil the established specifications. Since the product is given its final form and function in the assembly, product verification normally takes place somewhere in the assembly line, which is the focus of this paper.
Abstract:
Background: Previous assessment methods for Parkinson Gait (PG) recognition used sensor mechanisms that may cause discomfort. In order to avoid the stress of applying wearable sensors, computer vision (CV) based diagnostic systems for PG recognition have been proposed. The main constraints in these methods are the laboratory setup procedures: novel colored garments were designed specifically for the patients so that the test body could be segmented from a background of a specific color. Objective: To develop an image processing tool for home assessment of PG by analyzing motion cues extracted during the gait cycles. Methods: The system is based on the idea that a normal body attains equilibrium during the gait by aligning the body posture with the axis of gravity. Due to rigidity in muscular tone, persons with Parkinson's disease (PD) fail to align their bodies with the axis of gravity. The leaned posture of PD patients appears to fall forward, whereas a normal posture remains constantly erect throughout the gait. Patients with PD walk with a shortened stride angle (less than 15 degrees on average) and high variability in the stride frequency, whereas a normal gait exhibits a constant stride frequency with an average stride angle of 45 degrees. In order to analyze PG, levodopa-responsive patients and normal controls were videotaped over several gait cycles. First, the test body is segmented in each frame of the gait video, based on the pixel contrast with the background, to form a silhouette. Next, the center of gravity of this silhouette is calculated. The silhouette is then skeletonized to extract the motion cues. The two motion cues were the stride frequency, based on the cyclic leg motion, and the lean frequency, based on the angle between the leaned torso tangent and the axis of gravity. The differences in the peaks of the stride and lean frequencies between PG and normal gait are calculated using cosine similarity measurements. Results: High cosine dissimilarity was observed in the stride and lean frequencies between PG and normal gait. High variations are found in the stride intervals of PG, whereas constant stride intervals are found in the normal gait. Conclusions: We propose an algorithm as a means of eliminating laboratory constraints and discomfort during PG analysis. Installing this tool on a home computer with a webcam allows assessment of gait in the home environment.
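The cosine similarity measurement mentioned in the Methods can be written down directly. The sketch below uses the standard definition, the dot product divided by the product of the vector lengths. It assumes that the peaks of a stride or lean frequency signal have already been collected into two equal-length lists of numbers; how the paper builds those lists is not stated in the abstract. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_dissimilarity(a, b):
    """One minus the cosine similarity; larger values mean less alike."""
    return 1.0 - cosine_similarity(a, b)
```

Two peak lists that differ only by a constant scale factor give a dissimilarity near zero, so the measure responds to the pattern of the peaks and not to their overall magnitude.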
Abstract:
In an attempt to find out which of the two Swedish prosodic contrasts of 1) word stress pattern and 2) tonal word accent category has the greatest communicative weight, a lexical decision experiment was conducted: in one part the word stress pattern was changed from trochaic to iambic, and in the other part trochaic accent II words were changed to accent I. Native Swedish listeners were asked to decide whether the distorted words were real words or 'non-words'. A clear tendency is that listeners gave more 'non-word' responses when the stress pattern was shifted than when the word accent category was shifted. This could have implications for the priority of phonological features when teaching Swedish as a second language.
Abstract:
This thesis presents a system to recognise and classify road and traffic signs for the purpose of developing an inventory of them which could assist highway engineers in their task of updating and maintaining them. It uses images taken by a camera from a moving vehicle. The system is based on three major stages: colour segmentation, recognition, and classification. Four colour segmentation algorithms are developed and tested: a shadow and highlight invariant algorithm, a dynamic threshold algorithm, a modification of de la Escalera's algorithm, and a fuzzy colour segmentation algorithm. All algorithms are tested using hundreds of images, and the shadow-highlight invariant algorithm is eventually chosen as the best performer because it is immune to shadows and highlights. It is also robust, as it was tested in different lighting conditions, weather conditions, and times of day. A successful segmentation rate of approximately 97% was achieved using this algorithm. Recognition of traffic signs is carried out using a fuzzy shape recogniser. Based on four shape measures (rectangularity, triangularity, ellipticity, and octagonality), fuzzy rules were developed to determine the shape of the sign. Among these shape measures, octagonality has been introduced in this research. The final decision of the recogniser is based on the combination of both the colour and the shape of the sign. The recogniser was tested in a variety of testing conditions, giving an overall performance of approximately 88%. Classification was undertaken using a Support Vector Machine (SVM) classifier. The classification is carried out in two stages: classification of the rim's shape followed by classification of the interior of the sign. The classifier was trained and tested using binary images in addition to five different types of features: Geometric moments, Zernike moments, Legendre moments, Orthogonal Fourier-Mellin moments, and Binary Haar features. The performance of the SVM was tested using different features, kernels, SVM types, SVM parameters, and moment orders. The average classification rate achieved is about 97%. Binary images show the best testing results, followed by Legendre moments. The linear kernel gives the best testing results, followed by RBF. C-SVM shows very good performance, but ν-SVM gives better results in some cases.
Abstract:
The English language has become an international language and is globally used as a lingua franca. Therefore, there has been a shift in English-language education toward teaching English as an international language (EIL). Teaching from the EIL paradigm means that English is seen as an international language used in communication by people from different linguistic and cultural backgrounds. As the approach to English-language education moves away from the traditional native-speaker, target-country context, so does the role of culture within English-language teaching. The aim of this thesis is to investigate and analyse cultural representations in two Swedish EFL textbooks used in upper-secondary school to see how they correspond with the EIL paradigm. This is done by focusing on the geographical origin of the cultural content as well as looking at what kinds of culture are represented in the textbooks. A content analysis of the textbooks is conducted, using Kachru's Concentric Circles of English as the model for the analysis of geographical origin. Horibe's model of the three different kinds of culture in EIL is used for coding the second part of the analysis. The results of the analysis show that the culture of target countries and "Culture as social custom" dominate the cultural content of the textbooks. Thus, although there are some indications that the EIL paradigm has influenced the textbooks, the traditional approach to culture in language teaching still prevails in the analysed textbooks. Because of the relatively small sample included in the thesis, further studies need to be conducted in order to draw conclusions regarding the Swedish context as a whole.