10 results for Speaker Recognition, Text-constrained, Multilingual, Speaker Verification, HMMs
in AMS Tesi di Laurea - Alm@DL - Università di Bologna
Abstract:
The recording and processing of voice data raise increasing privacy concerns for users and service providers. One way to address these issues is to move processing to the edge device, closer to the recording, so that potentially identifiable information is not transmitted over the internet. However, this is often not possible due to hardware limitations. An interesting alternative is the development of voice anonymization techniques that remove individual speakers' characteristics while preserving the linguistic and acoustic information in the data. In this work, a state-of-the-art approach to sequence-to-sequence speech conversion, initially based on x-vectors and bottleneck features for automatic speech recognition, is explored to disentangle these two kinds of acoustic information using different pre-trained speech and speaker representations. Furthermore, different strategies for selecting target speech representations are analyzed. Results on public datasets, in terms of equal error rate and word error rate, show that good privacy is achieved with limited impact on converted speech quality relative to the original method.
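As an aside on the equal error rate metric used above for measuring privacy, the sketch below shows one common way to compute EER from verification trial scores with scikit-learn. It is only an illustration, not the thesis' evaluation pipeline, and the score arrays are invented placeholders.

```python
# Minimal EER sketch (illustrative only): equal error rate from verification scores.
import numpy as np
from sklearn.metrics import roc_curve

def equal_error_rate(labels, scores):
    """labels: 1 = same speaker, 0 = different speaker; scores: similarity scores."""
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1 - tpr
    # EER is the operating point where false positive and false negative rates meet.
    idx = np.nanargmin(np.abs(fnr - fpr))
    return (fpr[idx] + fnr[idx]) / 2

# Hypothetical trial scores: a higher EER after anonymization indicates better privacy.
labels = np.array([1, 1, 1, 0, 0, 0])
scores = np.array([0.9, 0.7, 0.4, 0.5, 0.3, 0.1])
print(f"EER: {equal_error_rate(labels, scores):.3f}")
```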
Abstract:
Ontology design and population, core aspects of semantic technologies, have recently become fields of great interest due to the increasing need for domain-specific knowledge bases that can boost the use of the Semantic Web. For building such knowledge resources, the state-of-the-art tools for ontology design require a lot of human work. Producing meaningful schemas and populating them with domain-specific data is in fact a very difficult and time-consuming task, even more so if the task consists in modelling knowledge at web scale. The primary aim of this work is to investigate a novel and flexible methodology for automatically learning ontologies from textual data, lightening the human workload required for conceptualizing domain-specific knowledge and populating an extracted schema with real data, and thus speeding up the whole ontology production process. Here computational linguistics plays a fundamental role, from automatically identifying facts in natural language and extracting frames of relations among recognized entities, to producing linked data with which to extend existing knowledge bases or create new ones. In the state of the art, automatic ontology learning systems are mainly based on plain pipelined linguistic classifiers performing tasks such as named entity recognition, entity resolution, taxonomy and relation extraction [11]. These approaches present some weaknesses, especially in capturing the structures through which the meaning of complex concepts is expressed [24]. Humans, in fact, tend to organize knowledge in well-defined patterns, which include participant entities and meaningful relations linking entities with each other. In the literature, these structures have been called Semantic Frames by Fillmore [20] or, more recently, Knowledge Patterns [23]. Some NLP studies have recently shown the possibility of performing more accurate deep parsing with the ability of logically understanding the structure of discourse [7]. In this work, some of these technologies have been investigated and employed to produce accurate ontology schemas. The long-term goal is to collect large amounts of semantically structured information from the web of crowds, through an automated process, in order to identify and investigate the cognitive patterns used by humans to organize their knowledge.
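To make the pipeline stages mentioned above (named entity recognition, relation extraction) concrete, the snippet below sketches the entity-recognition step with the off-the-shelf spaCy library. It is only an illustration under generic assumptions, not the system developed in the thesis, and the example sentence is invented.

```python
# Illustrative sketch of the named-entity-recognition step with spaCy
# (requires: pip install spacy && python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Guglielmo Marconi studied at the University of Bologna in Italy.")

# Entities recognized in the sentence become candidate nodes for an ontology schema.
for ent in doc.ents:
    print(ent.text, ent.label_)

# Dependency relations between tokens can then be inspected as candidate frame slots.
for token in doc:
    if token.dep_ in ("nsubj", "dobj", "pobj"):
        print(token.text, "--", token.dep_, "-->", token.head.text)
```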
Abstract:
It has recently been noticed that interpreters tend to converge with their speakers' emotions through a process known as emotional contagion. Emotional contagion still represents an under-investigated aspect of interpreting, and the few studies on this topic have tended to focus more on simultaneous interpreting than on consecutive interpreting. Korpal & Jasielska (2019) compared the emotional effects of one emotional and one neutral text on interpreters in simultaneous interpreting and found that interpreters tended to converge emotionally with the speaker more when interpreting the emotional text. This exploratory study follows their procedures to study the emotional contagion potentially caused by two texts among interpreters in consecutive interpreting: one emotionally neutral text and one negatively-valenced text, the latter containing 44 negative words as triggers. Several measures were triangulated to determine whether the triggers in the negatively-valenced text could prompt a stronger emotional contagion in the consecutive interpreting of that text as compared to the consecutive interpreting of the emotionally neutral text, which contained no triggers: namely, the quality of the interpreters' delivery; their heart rate variability values as collected with EMPATICA E4 wristbands; the analysis of their acoustic variations (i.e., disfluencies and rhetorical strategies); their linguistic and emotional management of the triggers; and their answers to the Italian version of the Positive and Negative Affect Schedule (PANAS) self-report questionnaire. Results showed no statistically significant evidence of an emotional contagion evoked by the triggers in the consecutive interpreting of the negative text as opposed to the consecutive interpreting of the neutral text. On the contrary, interpreters seemed to be more at ease while interpreting the negative text. This surprising result, together with other findings of this project, suggests avenues for further research.
Abstract:
In recent years, we have witnessed great changes in the industrial environment as a result of the innovations introduced by Industry 4.0, especially in the integration of the Internet of Things, Automation and Robotics in the manufacturing field. The project presented in this thesis lies within this innovation context and describes the implementation of an Image Recognition application focused on the automotive field. The project aims at helping the supply chain operator to perform an effective and efficient check of the homologation tags present on vehicles. The user's contribution consists in taking a picture of the tag; the application then automatically, exploiting Amazon Web Services, returns the result of the check on the correctness of the tag, its correct positioning within the vehicle, and the presence of faults or defects on the tag. To implement this application we combined two IoT platforms widely used in the industrial field: Amazon Web Services (AWS) and ThingWorx. AWS exploits Convolutional Neural Networks to perform Text Detection and Image Recognition, while PTC ThingWorx manages the user interface and the data manipulation.
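As a rough illustration of how a text-detection call on AWS can look (the thesis' actual AWS/ThingWorx architecture is not reproduced here), a minimal boto3 sketch follows; the image file name and region are placeholders.

```python
# Minimal sketch of text detection on a tag photo with Amazon Rekognition via boto3.
# The file name and region below are placeholders, not values from the project.
import boto3

client = boto3.client("rekognition", region_name="eu-west-1")

with open("homologation_tag.jpg", "rb") as image_file:
    response = client.detect_text(Image={"Bytes": image_file.read()})

# Each detection carries the recognized string, a confidence score and a bounding box,
# which a downstream check could compare against the expected tag content and position.
for detection in response["TextDetections"]:
    if detection["Type"] == "LINE":
        print(detection["DetectedText"], round(detection["Confidence"], 1))
```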
Abstract:
With the increase in load demand across various sectors, the protection and safety of the network are key factors that have to be taken into consideration for the electric grid and distribution network. A Phasor Measurement Unit (PMU) is an intelligent electronic device that collects data in the form of real-time synchrophasors with a precise time tag obtained via GPS (Global Positioning System) and transfers the data to the grid command for monitoring and assessment. The measurements made by a PMU have to be very precise, in order to protect relays and measuring equipment, according to IEEE 60255-118-1 (2018). Since a physical PMU is a very expensive device on which to research and develop new functionalities, there is a need to find an alternative to work with. Hence, many open-source virtual libraries are available that replicate the function of a PMU in a virtual (software) environment, allowing research on multiple objectives to continue while providing very low error when verified. In this thesis, I carried out performance and compliance verification of a virtual PMU developed in MATLAB using the I-DFT (Interpolated Discrete Fourier Transform) C-class algorithm. A test environment was developed in MATLAB, and the virtual PMU was tested under both steady-state and dynamic conditions to verify compliance with the latest standard (IEEE 60255-118-1).
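The interpolated-DFT idea underlying such virtual PMUs can be summarized as: window the observation, take the DFT, locate the largest bin, and refine the frequency estimate from the ratio of the two largest bins. The sketch below shows the classic two-point Hann-window interpolation in Python; it is a generic illustration, not the MATLAB C-class implementation verified in the thesis, and the signal parameters are invented.

```python
# Illustrative two-point interpolated-DFT (IpDFT) frequency estimate with a Hann window.
# Generic sketch only; not the thesis' MATLAB C-class algorithm.
import numpy as np

fs = 5000.0          # sampling rate in Hz (invented)
n = 500              # samples per observation window (0.1 s)
f_true = 50.3        # off-nominal tone frequency to estimate (invented)

t = np.arange(n) / fs
x = np.cos(2 * np.pi * f_true * t + 0.7)

window = np.hanning(n)
mag = np.abs(np.fft.rfft(x * window))

k = int(np.argmax(mag[1:]) + 1)    # index of the largest bin (skip DC)
# Pick the larger neighbour and interpolate: for a Hann window,
# delta = (2*alpha - 1) / (alpha + 1) with alpha = |X(k±1)| / |X(k)|.
if mag[k + 1] >= mag[k - 1]:
    alpha = mag[k + 1] / mag[k]
    delta = (2 * alpha - 1) / (alpha + 1)
else:
    alpha = mag[k - 1] / mag[k]
    delta = -(2 * alpha - 1) / (alpha + 1)

f_est = (k + delta) * fs / n
print(f"true {f_true:.3f} Hz, estimated {f_est:.3f} Hz")
```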
Abstract:
In this thesis we address a multi-label hierarchical text classification problem in a low-resource setting and explore different approaches to identify the best one for our case. The goal is to train a model that classifies English school exercises according to a hierarchical taxonomy using few labeled data. The experiments carried out in this work employ different machine learning models and text representation techniques: CatBoost with tf-idf features, classifiers based on pre-trained models (mBERT, LASER), and SetFit, a framework for few-shot text classification. SetFit proved to be the most promising approach, achieving the best performance when only a few labeled examples per class are available during training. However, this thesis does not consider the whole hierarchical taxonomy, but only its first two levels: to address classification with the classes at the third level, further experiments should be carried out, exploring methods for zero-shot text classification, data augmentation, and strategies to exploit the hierarchical structure of the taxonomy during training.
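As a minimal illustration of one of the baselines mentioned above (tf-idf features fed to CatBoost), a hedged sketch follows; the exercise texts and labels are invented placeholders and do not come from the thesis' dataset or taxonomy.

```python
# Illustrative sketch of a tf-idf + CatBoost text classification baseline.
# The exercise texts and labels below are invented placeholders.
from catboost import CatBoostClassifier
from sklearn.feature_extraction.text import TfidfVectorizer

train_texts = [
    "Fill in the blanks with the past simple of the verbs in brackets.",
    "Read the passage and answer the comprehension questions below.",
    "Choose the correct preposition to complete each sentence.",
    "Listen to the dialogue and summarise it in your own words.",
]
train_labels = ["grammar", "reading", "grammar", "listening"]

vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=1)
x_train = vectorizer.fit_transform(train_texts)

model = CatBoostClassifier(iterations=200, verbose=False)
model.fit(x_train, train_labels)

new_exercise = ["Complete the sentences using the present perfect."]
print(model.predict(vectorizer.transform(new_exercise)))
```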
Abstract:
Nowadays, some activities, such as subscribing to an insurance policy or opening a bank account, can be carried out by navigating through a web page or a downloadable application. Since the user is often “hidden” behind a monitor or a smartphone, a solution able to guarantee their identity is necessary. Companies often require the submission of a “proof of identity”, which usually consists of a picture of an identity document of the user, together with a picture or a brief video of themselves. This work describes a system whose purpose is the automation of these kinds of verification.
Abstract:
The usage of Optical Character Recognition (OCR) systems is a widespread technology in the world of Computer Vision and Machine Learning. It is a topic of interest to many fields, for example the automotive one, where it becomes a specialized task known as License Plate Recognition, useful for many applications, from toll road automation to intelligent payments. However, OCR systems need to be very accurate and generalizable in order to extract the text of license plates under highly variable conditions, from the type of camera used for acquisition to changes in lighting. Such variables compromise the quality of digitized real scenes, causing noise and degradation of various types, which can be minimized by applying modern approaches for image super-resolution and noise reduction. One class of these approaches is known as Generative Neural Networks, which are a very strong ally for the solution of this popular problem.
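For context on the recognition step itself (not the GAN-based super-resolution and denoising investigated in the thesis), a minimal OCR sketch with pytesseract is shown below; the image path is a placeholder.

```python
# Minimal OCR sketch with pytesseract (requires the Tesseract binary to be installed).
# This only illustrates the recognition step on an already-cropped plate image;
# it is not the generative super-resolution pipeline studied in the thesis.
from PIL import Image
import pytesseract

plate = Image.open("plate_crop.png").convert("L")   # grayscale often helps OCR
# Treat the crop as a single text line and restrict the character set,
# since license plates use only uppercase letters and digits.
config = "--psm 7 -c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
text = pytesseract.image_to_string(plate, config=config)
print(text.strip())
```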
Abstract:
Recognition of everyday human activity through mobile personal sensing technology plays a central role in the field of pervasive healthcare. The Bologna-based American company eSteps Inc. addresses the growing problem of motor disability of the lower limbs by offering pre-, during- and post-hospitalisation monitoring solutions with a biomechanics and telerehabilitation protocol. It has developed a smart, customised and sustainable device to monitor motor activity, fatigue and injury risk for patients, and a special app to share data with caregivers and medical specialists. The objective of this study is the development of an Artificial Intelligence model to recognize the activity performed by a person with Multiple Sclerosis or a healthy person through eSteps devices.
Abstract:
The primary goal of this thesis is to verify the rupture disc sizing of the acrylic reactor. The test to check the sizing was divided into several stages. The work then went on to examine ideas to explain the concerns, as well as ethical approaches, remedies and suggestions to solve the issues and difficulties that were discovered. This thesis highlights the gathering and organization of reaction data (recipe composition, enthalpies, reaction temperature, and catalyst feeding times) for the products to be chosen, in accordance with pre-established criteria. A further stage was the collaboration with the research and development team in the lab to carry out calorimetric testing for the important recipes that had been identified. The verification of the rupture discs currently installed in the plant, based on the calorimetric test findings, is the final stage. This thesis used two separate calorimetry techniques: Phi-TEC II adiabatic calorimetry and differential scanning calorimetry (DSC). The target of the experiments is to check and confirm the correct size of the reactor rupture disc. The Arkema (Boretto/Coatex) plant (Emilia-Romagna) provided a recipe and a scenario following multiple meetings and discussions. The purpose of this technical paper is to describe the outcomes of adiabatic calorimetry performed at the lab scale, so that the computation of the vents for a particular recipe and scenario can be verified.