9 resultados para Query Languages
em AMS Tesi di Laurea - Alm@DL - Università di Bologna
Resumo:
The central objective of research in Information Retrieval (IR) is to discover new techniques to retrieve relevant information in order to satisfy an Information Need. The Information Need is satisfied when relevant information can be provided to the user. In IR, relevance is a fundamental concept which has changed over time, from popular to personal, i.e., what was considered relevant before was information for the whole population, but what is considered relevant now is specific information for each user. Hence, there is a need to connect the behavior of the system to the condition of a particular person and his social context; thereby an interdisciplinary sector called Human-Centered Computing was born. For the modern search engine, the information extracted for the individual user is crucial. According to the Personalized Search (PS), two different techniques are necessary to personalize a search: contextualization (interconnected conditions that occur in an activity), and individualization (characteristics that distinguish an individual). This movement of focus to the individual's need undermines the rigid linearity of the classical model overtaken the ``berry picking'' model which explains that the terms change thanks to the informational feedback received from the search activity introducing the concept of evolution of search terms. The development of Information Foraging theory, which observed the correlations between animal foraging and human information foraging, also contributed to this transformation through attempts to optimize the cost-benefit ratio. This thesis arose from the need to satisfy human individuality when searching for information, and it develops a synergistic collaboration between the frontiers of technological innovation and the recent advances in IR. The search method developed exploits what is relevant for the user by changing radically the way in which an Information Need is expressed, because now it is expressed through the generation of the query and its own context. As a matter of fact the method was born under the pretense to improve the quality of search by rewriting the query based on the contexts automatically generated from a local knowledge base. Furthermore, the idea of optimizing each IR system has led to develop it as a middleware of interaction between the user and the IR system. Thereby the system has just two possible actions: rewriting the query, and reordering the result. Equivalent actions to the approach was described from the PS that generally exploits information derived from analysis of user behavior, while the proposed approach exploits knowledge provided by the user. The thesis went further to generate a novel method for an assessment procedure, according to the "Cranfield paradigm", in order to evaluate this type of IR systems. The results achieved are interesting considering both the effectiveness achieved and the innovative approach undertaken together with the several applications inspired using a local knowledge base.
Resumo:
I sistemi mobili rappresentano una classe di sistemi distribuiti caratterizzata dalla presenza di dispositivi portatili eterogenei quali PDA, laptop e telefoni cellulari che interagiscono tra loro mediante una rete di interconnessione wireless. Una classe di sistemi mobili di particolare interesse è costituita dai sistemi basati sul modello di interazione publish/subscribe. Secondo tale schema, i nodi all'interno di una rete possono assumere due ruoli differenti: i produttori di informazione, chiamati publisher, ed i consumatori di informazione, chiamati subscriber. Tipicamente, l'interazione tra essi è mediata da un gestore di eventi che indirizza correttamente le informazioni ricevute dai publisher verso i subscriber interessati, sulla base degli interessi espressi da questi ultimi tramite sottoscrizioni. Nella progettazione di sistemi mobili, a differenza di quelli tradizionali basati su nodi fissi, bisogna tenere conto di problemi quali la scarsa capacità computazionale dei dispositivi e la limitata larghezza di banda delle reti wireless. All'interno di tale ambito, stanno recentemente assumendo sempre maggiore importanza i sistemi context-aware, ovvero sistemi mobili che sfruttano i dati provenienti dall'ambiente circostante e dai dispositivi stessi per adattare il proprio comportamento e notificare agli utenti la presenza di informazioni potenzialmente utili. Nello studio di questi sistemi, si è notato che i nodi che si trovano nella stessa area geografica generano tipicamente delle sottoscrizioni che presentano tra loro un certo grado di similarità e coperture parziali o totali. Il gruppo di ricerca del DEIS dell’Università di Bologna ha sviluppato un'infrastruttura di supporto per sistemi mobili context-aware, chiamata SALES. Attualmente il sistema progettato non considera le similarità delle sottoscrizioni e quindi non sfrutta opportunamente tale informazione. In questo contesto si rende necessario l'adozione di opportune tecniche di aggregazione delle sottoscrizioni atte ad alleggerire la computazione dei nodi mobili e le comunicazioni tra loro. Il lavoro presentato in questa tesi sarà finalizzato alla ricerca, all'adattamento ed all'implementazione di una tecnica di aggregazione delle sottoscrizioni. Tale tecnica avrà lo scopo di rilevare e sfruttare le similarità delle sottoscrizioni ricevute dal sistema al fine di ridurne il numero; in questo modo, quando un nodo riceverà un dato, il processo di confronto tra l'insieme delle sottoscrizioni memorizzate e il dato ricevuto sarà più leggero, consentendo un risparmio di risorse computazionali. Inoltre, adattando tali tecniche, sarà possibile modulare anche il traffico dati scaturito dalle risposte alle sottoscrizioni. La struttura di questa tesi prevede un primo capitolo sui sistemi context-aware, descrivendone le principali caratteristiche e mettendo in luce le problematiche ad essi associate. Il secondo capitolo illustra il modello di comunicazione Publish/Subscribe, modello di riferimento per i moderni sistemi context-aware e per i sistemi mobili in generale. Il terzo capitolo descrive l'infrastruttura SALES sulla quale si è progettata, implementata e testata la soluzione proposta in questa tesi. Il quarto capitolo presenta le principali tecniche di aggregazione delle sottoscrizioni e spiega come possono essere adattate al contesto di questa tesi. Il quinto capitolo effettua l'analisi dei requisiti per comprendere meglio il comportamento della soluzione; seguono la progettazione e l’implementazione della soluzione su SALES. Infine, il sesto capitolo riporta in dettaglio i risultati ottenuti da alcuni degli esperimenti effettuati e vengono messi a confronto con quelli rilevati dal sistema di partenza.
Resumo:
In computer systems, specifically in multithread, parallel and distributed systems, a deadlock is both a very subtle problem - because difficult to pre- vent during the system coding - and a very dangerous one: a deadlocked system is easily completely stuck, with consequences ranging from simple annoyances to life-threatening circumstances, being also in between the not negligible scenario of economical losses. Then, how to avoid this problem? A lot of possible solutions has been studied, proposed and implemented. In this thesis we focus on detection of deadlocks with a static program analysis technique, i.e. an analysis per- formed without actually executing the program. To begin, we briefly present the static Deadlock Analysis Model devel- oped for coreABS−− in chapter 1, then we proceed by detailing the Class- based coreABS−− language in chapter 2. Then, in Chapter 3 we lay the foundation for further discussions by ana- lyzing the differences between coreABS−− and ASP, an untyped Object-based calculi, so as to show how it can be possible to extend the Deadlock Analysis to Object-based languages in general. In this regard, we explicit some hypotheses in chapter 4 first by present- ing a possible, unproven type system for ASP, modeled after the Deadlock Analysis Model developed for coreABS−−. Then, we conclude our discussion by presenting a simpler hypothesis, which may allow to circumvent the difficulties that arises from the definition of the ”ad-hoc” type system discussed in the aforegoing chapter.
Resumo:
La tesi presenta una serie di risultati dell'analisi quantitativa sulla linguistica. Inizialmente sono studiate due fra le leggi empiriche più famose di questo campo, le leggi di Zipf e Heaps, e vengono esposti vari modelli sullo sviluppo del linguaggio. Nella seconda parte si giunge alla discussione di risultati più specifici sulla presenza di fenomeni di burstiness e di correlazioni a lungo raggio nei testi. Tutti questi studi teorici sono affiancati da analisi sperimentali, svolte utilizzando varie traduzioni del libro "Guerra e pace" di Leo Tolstoj e concentrate principalmente sulle eventuali differenze riscontrabili tra le diverse lingue.
Resumo:
Ireland is a country in which two languages are spoken: English and Irish. This thesis analyzes the historical relationship between the languages, the cultural codes and meanings attached to each of them, as well as how much of the culture of its speakers each is able to carry. Beyond that, the influence the two languages have exercised on one another and their mutual entwinement is taken into closer examination.
Resumo:
Nowadays, modern society is gradually becoming multicultural. However, only in the last few years awareness on its importance has been raised. In the case of Colombia, multiculturalism has existed since the pre-Columbian period and today there are more than 80 ethnic groups and 65 indigenous languages in the country. The aim of this work is to illustrate the status of indigenous languages in Colombia and to enlighten about the importance of recognizing, protecting and strengthening the use of these native languages. Subsequent to this, it will be point out that linguistic diversity should be considered a resource and not a barrier to achieve unity in diversity. Finally, ethno-education will be presented as an adequate educational program that may guarantee an equal linguistic representation in the country.
Resumo:
Most of the existing open-source search engines, utilize keyword or tf-idf based techniques to find relevant documents and web pages relative to an input query. Although these methods, with the help of a page rank or knowledge graphs, proved to be effective in some cases, they often fail to retrieve relevant instances for more complicated queries that would require a semantic understanding to be exploited. In this Thesis, a self-supervised information retrieval system based on transformers is employed to build a semantic search engine over the library of Gruppo Maggioli company. Semantic search or search with meaning can refer to an understanding of the query, instead of simply finding words matches and, in general, it represents knowledge in a way suitable for retrieval. We chose to investigate a new self-supervised strategy to handle the training of unlabeled data based on the creation of pairs of ’artificial’ queries and the respective positive passages. We claim that by removing the reliance on labeled data, we may use the large volume of unlabeled material on the web without being limited to languages or domains where labeled data is abundant.
Resumo:
During the last semester of the Master’s Degree in Artificial Intelligence, I carried out my internship working for TXT e-Solution on the ADMITTED project. This paper describes the work done in those months. The thesis will be divided into two parts representing the two different tasks I was assigned during the course of my experience. The First part will be about the introduction of the project and the work done on the admittedly library, maintaining the code base and writing the test suits. The work carried out is more connected to the Software engineer role, developing features, fixing bugs and testing. The second part will describe the experiments done on the Anomaly detection task using a Deep Learning technique called Autoencoder, this task is on the other hand more connected to the data science role. The two tasks were not done simultaneously but were dealt with one after the other, which is why I preferred to divide them into two separate parts of this paper.