5 results for Syntactic And Semantic Comprehension Tasks
in AMS Tesi di Laurea - Alm@DL - Università di Bologna
Abstract:
Artificial Intelligence is reshaping the fashion industry in several ways. E-commerce retailers exploit their data through AI to enhance their search engines, make outfit suggestions and forecast the success of a specific fashion product. This is a challenging endeavour, however, as the data they possess is huge, complex and multi-modal. The most common way to search for fashion products online is by matching keywords against phrases in the product's description, which is often cluttered, inadequate and differs across collections and sellers. A customer may also browse an online store's taxonomy, although this is time-consuming and doesn't guarantee relevant items. With the advent of Deep Learning architectures, particularly Vision-Language models, ad-hoc solutions have been proposed that model both the product image and its description to solve these problems. However, the suggested solutions do not effectively exploit the semantic or syntactic information of these modalities, nor the unique qualities and relations of clothing items. In this thesis, a novel approach is proposed to address these issues: it models and processes images and text descriptions as graphs in order to exploit the relations within and between each modality, and employs specific techniques to extract syntactic and semantic information. The results obtained show promising performance on different tasks when compared to current state-of-the-art deep learning architectures.
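As a purely illustrative sketch of the kind of representation the abstract mentions — not the thesis's actual construction, which may use syntactic dependencies or attribute relations — a product description can be turned into a graph by linking adjacent tokens:

```python
# Illustrative only: build an undirected word graph from a product
# description by connecting consecutive tokens. The tokenization and
# edge definition here are simplifying assumptions, not the thesis's method.
from collections import defaultdict

def description_to_graph(description):
    """Return an adjacency map (word -> set of neighbouring words)."""
    tokens = description.lower().replace(",", " ").split()
    graph = defaultdict(set)
    for a, b in zip(tokens, tokens[1:]):
        graph[a].add(b)
        graph[b].add(a)
    return graph

g = description_to_graph("red cotton dress, long sleeves")
print(sorted(g["dress"]))  # neighbours of "dress" in this toy graph
```

A graph built this way already exposes intra-modality relations (which attributes co-occur around which garment words); the thesis pairs such text graphs with image graphs to also capture cross-modal relations.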
Abstract:
Nowadays communication is shifting from a centralized scenario, where media like newspapers, radio and TV programs produce information and people are merely consumers, to a completely different decentralized scenario, where everyone is potentially an information producer through social networks, blogs and forums that allow real-time worldwide information exchange. These new instruments, as a result of their widespread diffusion, have started playing an important socio-economic role. They are the most used communication media and, as a consequence, they constitute the main source of information that enterprises, political parties and other organizations can rely on. Analyzing the data stored in servers all over the world is feasible by means of Text Mining techniques such as Sentiment Analysis, which aims to extract opinions from huge amounts of unstructured text. This makes it possible to determine, for instance, the degree of user satisfaction with products, services, politicians and so on. In this context, this dissertation presents new Document Sentiment Classification methods based on the mathematical theory of Markov Chains. All these approaches rely on a Markov Chain based model, which is language independent and whose key features are simplicity and generality, making it attractive with respect to previous, more sophisticated techniques. Every technique discussed has been tested on both Single-Domain and Cross-Domain Sentiment Classification tasks, comparing its performance with that of two previous works. The analysis shows that some of the examined algorithms produce results comparable with the best methods in the literature, on both single-domain and cross-domain tasks, in 2-class (i.e. positive and negative) Document Sentiment Classification.
However, there is still room for improvement: this work also indicates a path forward, namely that a good novel feature selection process would be enough to outperform the state of the art. Furthermore, since some of the proposed approaches show promising results in 2-class Single-Domain Sentiment Classification, future work will also validate these results on tasks with more than 2 classes.
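To make the general idea concrete — without claiming to reproduce the dissertation's actual models — a minimal Markov-chain sentiment classifier can train one word-transition chain per class and assign a document to the class whose chain gives it the higher smoothed log-likelihood. All names, the tokenization and the smoothing scheme below are illustrative assumptions:

```python
# Hypothetical sketch: one Markov chain per sentiment class, built from
# word-to-word transition counts; classification compares log-likelihoods.
import math
from collections import defaultdict

def train_chain(docs):
    """Count word-to-word transitions over whitespace-tokenized documents."""
    counts = defaultdict(lambda: defaultdict(int))
    for doc in docs:
        tokens = doc.lower().split()
        for prev, curr in zip(tokens, tokens[1:]):
            counts[prev][curr] += 1
    return counts

def log_likelihood(doc, counts, alpha=1.0, vocab_size=10000):
    """Log-probability of the document's transitions (add-alpha smoothing)."""
    tokens = doc.lower().split()
    ll = 0.0
    for prev, curr in zip(tokens, tokens[1:]):
        row = counts.get(prev, {})
        total = sum(row.values())
        ll += math.log((row.get(curr, 0) + alpha) / (total + alpha * vocab_size))
    return ll

def classify(doc, pos_chain, neg_chain):
    if log_likelihood(doc, pos_chain) >= log_likelihood(doc, neg_chain):
        return "positive"
    return "negative"

pos = train_chain(["great movie truly great", "really great plot"])
neg = train_chain(["terrible movie truly boring", "really boring plot"])
print(classify("truly great plot", pos, neg))  # → positive
```

Because the model only counts surface transitions, it is language independent, which is exactly the simplicity-and-generality trade-off the abstract highlights.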
Abstract:
Over the last few years, the massive popularity of video streaming platforms has impacted our daily habits, making watching movies and TV shows one of the main activities of our free time. By providing a wide range of foreign-language audiovisual content, these entertainment services can be a powerful resource for language learners, exposing them to authentic input. Moreover, research has shown the beneficial role of audiovisual textual aids such as native-language subtitles and target-language captions in enhancing language skills such as vocabulary and listening comprehension. The aim of this thesis is to analyze the existing literature on subtitled and captioned audiovisual materials used as a pedagogical tool for informal language learning.
Abstract:
In the collective imagination a robot is a human-like machine, like the androids of science fiction. However, the robots encountered most frequently are machines that do work that is too dangerous, boring or onerous for humans. Most of the robots in the world are of this type. They can be found in the automotive, medical, manufacturing and space industries. A robot is thus a system that contains sensors, control systems, manipulators, power supplies and software, all working together to perform a task. The development and use of such systems is an active area of research, and one of the main problems is the development of interaction skills with the surrounding environment, including the ability to grasp objects. To perform this task the robot needs to sense the environment and acquire information about the object: the physical attributes that may influence a grasp. Humans solve this grasping problem easily thanks to their past experience, which is why many researchers approach it from a machine learning perspective, finding a grasp for an object using information about already known objects. But humans can select the best grasp from a vast repertoire considering not only the physical attributes of the object but also the effect they want to obtain. This is why, in our case, the study of robot manipulation focuses on grasping and on integrating symbolic tasks with data gained through sensors. The learning model is based on a Bayesian Network that encodes the statistical dependencies between the data collected by the sensors and the symbolic task. This representation has several advantages: it takes into account the uncertainty of the real world, allowing the model to deal with sensor noise; it encodes notions of causality; and it provides a unified network for learning.
Since the network currently implemented is based on human expert knowledge, it is very interesting to devise an automated method to learn its structure: in the future more tasks and object features may be introduced, and a complex network designed solely from human expert knowledge can become unreliable. Since structure learning algorithms present some weaknesses, the goal of this thesis is to analyze the real data used in the network modeled by the human expert, implement a feasible structure learning approach, and compare the results with the network designed by the expert in order to possibly enhance it.
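The comparison between an expert-designed structure and a learned one is typically score-based. As a minimal sketch — with illustrative variable names and toy data, not the thesis's actual network — a BIC score (log-likelihood minus a complexity penalty) can rank two candidate structures over binary, fully observed variables:

```python
# Minimal BIC scoring sketch for discrete Bayesian network structures.
# Assumes binary variables and complete data; "shape"/"grasp_ok" and the
# two candidate structures are purely illustrative.
import math
from collections import Counter

def bic_score(data, structure):
    """BIC = log-likelihood - 0.5 * log(N) * (free parameters).
    `structure` maps each variable name to a tuple of its parents."""
    n = len(data)
    score = 0.0
    for var, parents in structure.items():
        # counts of each (parent configuration, value) pair
        joint = Counter((tuple(r[p] for p in parents), r[var]) for r in data)
        marg = Counter(tuple(r[p] for p in parents) for r in data)
        for (cfg, _val), c in joint.items():
            score += c * math.log(c / marg[cfg])
        # one free parameter per parent configuration (binary variable)
        score -= 0.5 * math.log(n) * (2 ** len(parents))
    return score

# toy data: grasp success correlates strongly with object shape
data = [{"shape": s, "grasp_ok": s} for s in (0, 1)] * 20 \
     + [{"shape": 0, "grasp_ok": 1}] * 2
expert = {"shape": (), "grasp_ok": ("shape",)}      # shape -> grasp_ok
independent = {"shape": (), "grasp_ok": ()}         # no edge
print(bic_score(data, expert) > bic_score(data, independent))  # → True
```

A structure learner (e.g. greedy hill climbing over edge additions and removals) would search for the structure maximizing such a score; comparing its output with the expert network is the kind of analysis the thesis proposes.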
Abstract:
This study aims to explore listeners’ perception of disfluencies, i.e. ungrammatical pauses, filled pauses, repairs, false starts and repetitions, which can irritate listeners and impede comprehension. As professional communicators, conference interpreters should be competent public speakers: their speech should be easily understood by listeners and should not contain elements that may be considered irritating. More specifically, the study seeks to understand to what extent listeners notice disfluencies and consider them irritating, and to examine whether there are differences between interpreters and non-interpreters and between different age groups. A survey was therefore carried out among professional interpreters, students of interpreting and people who regularly attend conferences. The respondents were asked to answer a questionnaire after listening to three speeches: three consecutive interpretations delivered during the final exams held at the Advanced School of Languages, Literature, Translation and Interpretation (SSLLTI) in Forlì. Since conference interpreters’ public speaking skills should be at least as good as those of the speakers at a conference, the speeches were presented to the listeners as speeches delivered during a conference, with no mention of interpreting. The study is divided into five chapters. Chapter I outlines the characteristics of the interpreter as a professional communicator. The quality criterion “user-friendliness” is explored, with a focus on features that make a speech more user-friendly: fluency, intonation, coherence and cohesion. The chapter also examines listeners’ quality expectations and evaluations. In Chapter II the methodology of the study is described. Chapter III contains a detailed analysis of the texts used for the study, focusing on the elements that may irritate listeners or impede comprehension, namely disfluencies, incorrect use of intonation, and a lack of coherence or cohesion.
Chapter IV outlines the results of the survey, while Chapter V presents our conclusions.