9 results for cross-language speaker recognition
in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Abstract:
This master's thesis examines speaker recognition and its usefulness for verifying a user's identity as part of value-added services in the telephone network. Telephone-operated services have generally relied on DTMF (touch-tone) selections sent with the phone's keys. The user's identity has been confirmed, for example, by means of a user ID and a secret PIN code. In the future, services may be based on speech recognition, in which case verifying the user by voice also seems sensible. The thesis first presents various biometric identification methods. It then looks more closely at voice-based speaker verification. In the practical part of the work, a prototype of a speaker verification application suited to telephone network services was implemented. The aim of the work was to explore the possible uses of the technology and to gather experience of implementing a speaker verification service with the future in mind. Java was used as the programming language for the prototype.
Abstract:
Speaker diarization is the process of sorting speech according to the speaker. Diarization helps to search and retrieve what a certain speaker uttered in a meeting. Applications of diarization systems extend to domains other than meetings, for example, lectures, telephone, television, and radio. In addition, diarization enhances the performance of several speech technologies such as speaker recognition, automatic transcription, and speaker tracking. Methodologies previously used in developing diarization systems are discussed. Prior results and techniques are studied and compared. Methods such as Hidden Markov Models and Gaussian Mixture Models that are used in speaker recognition and other speech technologies are also used in speaker diarization. The objective of this thesis is to develop a speaker diarization system for the meeting domain. The experimental part of this work indicates that zero-crossing rate can be used effectively to break the audio stream into segments, and that adaptive Gaussian Models fit short audio segments adequately. Results show that 35 Gaussian Models and an average segment length of one second are the optimum values for building a diarization system for the tested data. Segments uttered by the same speaker are united by bottom-up clustering, using a new approach that categorizes the mixture weights.
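The segmentation idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the frame length, hop size, jump threshold, and the function names `zero_crossing_rate`, `frame_rates`, and `change_points` are all assumptions chosen for illustration, and the input is a synthetic signal rather than real meeting audio.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def frame_rates(signal, frame_len, hop):
    """Zero-crossing rate of each full frame of the signal."""
    return [zero_crossing_rate(signal[s:s + frame_len])
            for s in range(0, len(signal) - frame_len + 1, hop)]


def change_points(rates, threshold):
    """Frame indices where the rate jumps by more than `threshold` (hypothetical rule)."""
    return [i for i in range(1, len(rates))
            if abs(rates[i] - rates[i - 1]) > threshold]


# Synthetic stand-in for audio: a slowly alternating source, then a fast one.
slow = ([1.0] * 20 + [-1.0] * 20) * 20   # 800 samples
fast = [1.0, -1.0] * 400                 # 800 samples
rates = frame_rates(slow + fast, frame_len=400, hop=400)
boundaries = change_points(rates, threshold=0.3)
print(rates, boundaries)
```

A real system would compute the rate on windowed audio frames and would likely smooth it before looking for boundaries; the fixed threshold here only shows where a segment cut could be placed.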
Abstract:
Summary: Fuzzy translation techniques in cross-language information retrieval between closely related languages
Abstract:
This thesis presents the results of an analysis of the content in the series of Russian textbooks Kafe Piter, which was widely used in Finnish educational institutions for adult learners at the time of this research. The purpose of this study is to determine and describe how a textbook may purvey an image of a foreign country (in this case, Russia). Mixed-methods research with a focus on the qualitative content analysis of Kafe Piter is performed. The guidelines for textbook evaluation of cultural content proposed by Byram (1993) are used in this study as the basis for creating a qualitative analysis checklist, which is adapted to the needs of the current research. The selection of the categories in the checklist is based on major themes where direct statements about Russia, Russian people and culture appear in the textbook. The cultural content and the way in which it is presented in Kafe Piter are also compared to the intercultural competence objectives of the Common European Framework of Reference for Languages. Because the textbook was not written by a native Russian speaker, it was also important to investigate the types of mistakes found in the books. A simple quantitative analysis in the form of descriptive statistics was done, which consisted of counting the mistakes and inaccuracies in Kafe Piter. The mistakes were categorized into several different groups: factual or cultural, lexicosemantic, grammatical, spelling and punctuation mistakes. Based on the results, the cultural content of Kafe Piter provides a rich variety of cultural information that allows for a good understanding of the Russian language and Russian culture. A sufficient number of cross-cultural elements also appear in the textbook, including cultural images and information describing and comparing Russian and Finnish ways of life.
Based on the cultural topics covered in Kafe Piter, we conclude that the textbook is in line with the intercultural competence objectives set out in the Common European Framework of Reference for Languages. The results of the study also make it clear that a thorough proofreading of Kafe Piter is needed in order to correct mistakes - more than 130 cultural and linguistic mistakes and inaccuracies appear in the textbook.
Abstract:
Recent advances in machine learning methods increasingly enable the automatic construction of various types of computer-assisted methods that have been difficult or laborious for human experts to program. The tasks for which tools of this kind are needed arise in many areas, here especially in the fields of bioinformatics and natural language processing. Machine learning methods may not work satisfactorily if they are not appropriately tailored to the task in question. However, their learning performance can often be improved by taking advantage of deeper insight into the application domain or the learning problem at hand. This thesis considers developing kernel-based learning algorithms that incorporate this kind of prior knowledge of the task in question in an advantageous way. Moreover, computationally efficient algorithms for training the learning machines for specific tasks are presented. In the context of kernel-based learning methods, the incorporation of prior knowledge is often done by designing appropriate kernel functions. Another well-known way is to develop cost functions that fit the task under consideration. For disambiguation tasks in natural language, we develop kernel functions that take account of the positional information and the mutual similarities of words. It is shown that the use of this information significantly improves the disambiguation performance of the learning machine. Further, we design a new cost function that is better suited to the task of information retrieval and to more general ranking problems than the cost functions designed for regression and classification. We also consider other applications of the kernel-based learning algorithms such as text categorization, and pattern recognition in differential display. We develop computationally efficient algorithms for training the considered learning machines with the proposed kernel functions.
We also design a fast cross-validation algorithm for learning algorithms of the regularized least-squares type. Further, we propose an efficient version of the regularized least-squares algorithm that can be used together with the new cost function for preference learning and ranking tasks. In summary, we demonstrate that the incorporation of prior knowledge is possible and beneficial, and that novel advanced kernels and cost functions can be used efficiently in algorithms.
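The abstract does not describe the fast cross-validation algorithm itself. As an illustration only, the sketch below uses the well-known leave-one-out shortcut for kernel regularized least squares: with G = K + λI and a = G⁻¹y, the held-out residual for example i equals a_i / (G⁻¹)_ii, so one matrix inversion replaces n retrainings. The Gaussian kernel, the data, λ = 0.1, and all function names are assumptions, and the thesis's own algorithm may differ.

```python
import math


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def regularized(K, idx, lam):
    """Submatrix of K on `idx` with lam added to the diagonal."""
    return [[K[r][c] + (lam if r == c else 0.0) for c in idx] for r in idx]


def loo_residuals_fast(K, y, lam):
    """Leave-one-out residuals from a single inversion of K + lam*I."""
    n = len(y)
    g_inv = invert(regularized(K, list(range(n)), lam))
    a = matvec(g_inv, y)
    return [a[i] / g_inv[i][i] for i in range(n)]


def loo_residuals_naive(K, y, lam):
    """Reference version: retrain n times, each time without one example."""
    n = len(y)
    residuals = []
    for i in range(n):
        keep = [j for j in range(n) if j != i]
        a = matvec(invert(regularized(K, keep, lam)), [y[j] for j in keep])
        prediction = sum(K[i][j] * w for j, w in zip(keep, a))
        residuals.append(y[i] - prediction)
    return residuals


# Toy data with a Gaussian kernel (both chosen only for illustration).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [0.1, 0.4, 0.9, 0.7, 0.2]
K = [[math.exp(-(a - b) ** 2) for b in xs] for a in xs]
fast_res = loo_residuals_fast(K, ys, lam=0.1)
naive_res = loo_residuals_naive(K, ys, lam=0.1)
print(max(abs(f - n) for f, n in zip(fast_res, naive_res)))
```

The naive routine is included only to check the shortcut on small data; the two agree up to floating-point rounding.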
Abstract:
Human activity recognition in everyday environments is a critical, but challenging task in Ambient Intelligence applications to achieve proper Ambient Assisted Living, and key challenges still remain to be dealt with to realize robust methods. One of the major limitations of the Ambient Intelligence systems today is the lack of semantic models of those activities on the environment, so that the system can recognize the specific activity being performed by the user(s) and act accordingly. In this context, this thesis addresses the general problem of knowledge representation in Smart Spaces. The main objective is to develop knowledge-based models, equipped with semantics to learn, infer and monitor human behaviours in Smart Spaces. Moreover, it is easy to recognize that some aspects of this problem have a high degree of uncertainty, and therefore, the developed models must be equipped with mechanisms to manage this type of information. A fuzzy ontology and a semantic hybrid system are presented to allow modelling and recognition of a set of complex real-life scenarios where vagueness and uncertainty are inherent to the human nature of the users that perform it. The handling of uncertain, incomplete and vague data (i.e., missing sensor readings and activity execution variations, since human behaviour is non-deterministic) is approached for the first time through a fuzzy ontology validated on real-time settings within a hybrid data-driven and knowledge-based architecture. The semantics of activities, sub-activities and real-time object interaction are taken into consideration. The proposed framework consists of two main modules: the low-level sub-activity recognizer and the high-level activity recognizer. The first module detects sub-activities (i.e., actions or basic activities) that take input data directly from a depth sensor (Kinect).
The main contribution of this thesis tackles the second component of the hybrid system, which lays on top of the previous one, in a superior level of abstraction, and acquires the input data from the first module's output, and executes ontological inference to provide users, activities and their influence in the environment, with semantics. This component is thus knowledge-based, and a fuzzy ontology was designed to model the high-level activities. Since activity recognition requires context-awareness and the ability to discriminate among activities in different environments, the semantic framework allows for modelling common-sense knowledge in the form of a rule-based system that supports expressions close to natural language in the form of fuzzy linguistic labels. The framework advantages have been evaluated with a challenging and new public dataset, CAD-120, achieving an accuracy of 90.1% and 91.1% respectively for low and high-level activities. This entails an improvement over both, entirely data-driven approaches, and merely ontology-based approaches. As an added value, for the system to be sufficiently simple and flexible to be managed by non-expert users, and thus, facilitate the transfer of research to industry, a development framework composed by a programming toolbox, a hybrid crisp and fuzzy architecture, and graphical models to represent and configure human behaviour in Smart Spaces, were developed in order to provide the framework with more usability in the final application. As a result, human behaviour recognition can help assisting people with special needs such as in healthcare, independent elderly living, in remote rehabilitation monitoring, industrial process guideline control, and many other cases. This thesis shows use cases in these areas.
Abstract:
Finnish companies cross listing in the United States is an exceptional phenomenon. This study examines the cross listing decision, cross listing choice and cross listing process with associated challenges and critical factors. The aim is to create an in-depth understanding of the cross listing process and the required financial information. Based on that, the aim is to establish the process phases with the challenges and the critical factors that ought to be considered before establishing the process, and re-evaluated and further considered at points in time during the process. The empirical part of this study is conducted as a qualitative study. The research data was collected through two approaches: the interview approach and the textual data approach. The interviews were conducted with Finnish practitioners in the field of accounting and finance. The textual data was from publicly available publications of this phenomenon by the two BIG5 accounting companies worldwide. The results of this study demonstrate that the benefits of cross listing in the U.S. are better growth opportunities, a reduced cost of capital and the production of higher quality financial information. In the decision making process companies should assess whether the benefits exceed the increased costs, the pressure for performance, the uncertainty of market recognition and the requirements of management. The exchange listing is seen as the most favourable cross listing choice for Finnish companies. The establishment of the processes for producing reliable, transparent and timely financial information was seen as both highly critical and very challenging. The critical success factors relating to the cross listing phases are the assessment and planning as well as the right mix of experiences and expertise. Timing plays an important role in the process. The results mainly corroborate the literature concerning cross listing decision and choice.
This study contributes to the literature on the cross listing process offering a useful model for the phases of the cross listing process.
Abstract:
Metal-ion-mediated base-pairing of nucleic acids has attracted considerable attention during the past decade, since it offers means to expand the genetic code by artificial base-pairs, to create predesigned molecular architecture by metal-ion-mediated inter- or intra-strand cross-links, or to convert double stranded DNA to a nano-scale wire. Such applications largely depend on the presence of a modified nucleobase in both strands engaged in the duplex formation. Hybridization of metal-ion-binding oligonucleotide analogs with natural nucleic acid sequences has received much less attention in spite of obvious applications. While the natural oligonucleotides hybridize with high selectivity, their affinity for complementary sequences is inadequate for a number of applications. In the case of DNA, for example, more than 10 consecutive Watson-Crick base pairs are required for a stable duplex at room temperature, making targeting of sequences shorter than this challenging. For example, many types of cancer exhibit distinctive profiles of oncogenic miRNA, the diagnostics of which is, however, difficult owing to the presence of only short single stranded loop structures. Metallo-oligonucleotides, with their superior affinity towards their natural complements, would offer a way to overcome the low stability of short duplexes. In this study a number of metal-ion-binding surrogate nucleosides were prepared and their interaction with nucleoside 5′-monophosphates (NMPs) has been investigated by ¹H NMR spectroscopy. To find metal ion complexes that could discriminate between natural nucleobases upon double helix formation, glycol nucleic acid (GNA) sequences carrying a Pd(II) ion with vacant coordination sites at a predetermined position were synthesized and their affinity to complementary as well as mismatched counterparts quantified by UV-melting measurements.