19 resultados para programming language processing

em AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the last decades, Artificial Intelligence has witnessed multiple breakthroughs in deep learning. In particular, purely data-driven approaches have opened to a wide variety of successful applications due to the large availability of data. Nonetheless, the integration of prior knowledge is still required to compensate for specific issues like lack of generalization from limited data, fairness, robustness, and biases. In this thesis, we analyze the methodology of integrating knowledge into deep learning models in the field of Natural Language Processing (NLP). We start by remarking on the importance of knowledge integration. We highlight the possible shortcomings of these approaches and investigate the implications of integrating unstructured textual knowledge. We introduce Unstructured Knowledge Integration (UKI) as the process of integrating unstructured knowledge into machine learning models. We discuss UKI in the field of NLP, where knowledge is represented in a natural language format. We identify UKI as a complex process comprised of multiple sub-processes, different knowledge types, and knowledge integration properties to guarantee. We remark on the challenges of integrating unstructured textual knowledge and bridge connections with well-known research areas in NLP. We provide a unified vision of structured knowledge extraction (KE) and UKI by identifying KE as a sub-process of UKI. We investigate some challenging scenarios where structured knowledge is not a feasible prior assumption and formulate each task from the point of view of UKI. We adopt simple yet effective neural architectures and discuss the challenges of such an approach. Finally, we identify KE as a form of symbolic representation. From this perspective, we remark on the need of defining sophisticated UKI processes to verify the validity of knowledge integration. To this end, we foresee frameworks capable of combining symbolic and sub-symbolic representations for learning as a solution.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The general aim of the thesis was to investigate how and to what extent the characteristics of action organization are reflected in language, and how they influence language processing and understanding. Even though a huge amount of research has been devoted to the study of the motor effects of language, this issue is very debated in literature. Namely, the majority of the studies have focused on low-level motor effects such as effector-relatedness of action, whereas only a few studies have started to systematically investigate how specific aspects of action organization are encoded and reflected in language. After a review of previous studies on the relationship between language comprehension and action (chapter 1) and a critical discussion of some of them (chapter 2), the thesis is composed by three experimental chapters, each devoted to a specific aspect of action organization. Chapter 3 presents a study designed with the aim to disentangle the effective time course of the involvement of the motor system during language processing. Three kinematics experiments were designed in order to determine whether and, at which stage of motor planning and execution effector-related action verbs influence actions executed with either the same or a different effector. Results demonstrate that the goal of an action can be linguistically re-activated, producing a modulation of the motor response. In chapter 4, a second study investigates the interplay between the role of motor perspective (agent) and the organization of action in motor chains. More specifically, this kinematics study aims at deepening how goal can be translated in language, using as stimuli simple sentences composed by a pronoun (I, You, He/She) and a verb. Results showed that the perspective activated by the pronoun You reflects the motor pattern of the “agent” combined with the chain structure of the verb. These data confirm an early involvement of the motor system in language processing, suggesting that it is specifically modulated by the activation of the agent’s perspective. In chapter 5, the issue of perspective is specifically investigated, focusing on its role in language comprehension. In particular, this study aimed at determining how a specific perspective (induced for example by a personal pronoun) modulates motor behaviour during and after language processing. A classical compatibility effect (the Action-sentence compatibility effect) has been used to this aim. In three behavioural experiments the authors investigated how the ACE is modulated by taking first or third person perspective. Results from these experiments showed that the ACE effect occurs only when a first-person perspective is activated by the sentences used as stimuli. Overall, the data from this thesis contributed to disentangle several aspects of how action organization is translated in language, and then reactivated during language processing. This constitutes a new contribution to the field, adding lacking information on how specific aspects such as goal and perspective are linguistically described. In addition, these studies offer a new point of view to understand the functional implications of the involvement of the motor system during language comprehension, specifically from the point of view of our social interactions.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Mainstream hardware is becoming parallel, heterogeneous, and distributed on every desk, every home and in every pocket. As a consequence, in the last years software is having an epochal turn toward concurrency, distribution, interaction which is pushed by the evolution of hardware architectures and the growing of network availability. This calls for introducing further abstraction layers on top of those provided by classical mainstream programming paradigms, to tackle more effectively the new complexities that developers have to face in everyday programming. A convergence it is recognizable in the mainstream toward the adoption of the actor paradigm as a mean to unite object-oriented programming and concurrency. Nevertheless, we argue that the actor paradigm can only be considered a good starting point to provide a more comprehensive response to such a fundamental and radical change in software development. Accordingly, the main objective of this thesis is to propose Agent-Oriented Programming (AOP) as a high-level general purpose programming paradigm, natural evolution of actors and objects, introducing a further level of human-inspired concepts for programming software systems, meant to simplify the design and programming of concurrent, distributed, reactive/interactive programs. To this end, in the dissertation first we construct the required background by studying the state-of-the-art of both actor-oriented and agent-oriented programming, and then we focus on the engineering of integrated programming technologies for developing agent-based systems in their classical application domains: artificial intelligence and distributed artificial intelligence. Then, we shift the perspective moving from the development of intelligent software systems, toward general purpose software development. Using the expertise maturated during the phase of background construction, we introduce a general-purpose programming language named simpAL, which founds its roots on general principles and practices of software development, and at the same time provides an agent-oriented level of abstraction for the engineering of general purpose software systems.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The study of ancient, undeciphered scripts presents unique challenges, that depend both on the nature of the problem and on the peculiarities of each writing system. In this thesis, I present two computational approaches that are tailored to two different tasks and writing systems. The first of these methods is aimed at the decipherment of the Linear A afraction signs, in order to discover their numerical values. This is achieved with a combination of constraint programming, ad-hoc metrics and paleographic considerations. The second main contribution of this thesis regards the creation of an unsupervised deep learning model which uses drawings of signs from ancient writing system to learn to distinguish different graphemes in the vector space. This system, which is based on techniques used in the field of computer vision, is adapted to the study of ancient writing systems by incorporating information about sequences in the model, mirroring what is often done in natural language processing. In order to develop this model, the Cypriot Greek Syllabary is used as a target, since this is a deciphered writing system. Finally, this unsupervised model is adapted to the undeciphered Cypro-Minoan and it is used to answer open questions about this script. In particular, by reconstructing multiple allographs that are not agreed upon by paleographers, it supports the idea that Cypro-Minoan is a single script and not a collection of three script like it was proposed in the literature. These results on two different tasks shows that computational methods can be applied to undeciphered scripts, despite the relatively low amount of available data, paving the way for further advancement in paleography using these methods.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This thesis intends to investigate two aspects of Constraint Handling Rules (CHR). It proposes a compositional semantics and a technique for program transformation. CHR is a concurrent committed-choice constraint logic programming language consisting of guarded rules, which transform multi-sets of atomic formulas (constraints) into simpler ones until exhaustion [Frü06] and it belongs to the declarative languages family. It was initially designed for writing constraint solvers but it has recently also proven to be a general purpose language, being as it is Turing equivalent [SSD05a]. Compositionality is the first CHR aspect to be considered. A trace based compositional semantics for CHR was previously defined in [DGM05]. The reference operational semantics for such a compositional model was the original operational semantics for CHR which, due to the propagation rule, admits trivial non-termination. In this thesis we extend the work of [DGM05] by introducing a more refined trace based compositional semantics which also includes the history. The use of history is a well-known technique in CHR which permits us to trace the application of propagation rules and consequently it permits trivial non-termination avoidance [Abd97, DSGdlBH04]. Naturally, the reference operational semantics, of our new compositional one, uses history to avoid trivial non-termination too. Program transformation is the second CHR aspect to be considered, with particular regard to the unfolding technique. Said technique is an appealing approach which allows us to optimize a given program and in more detail to improve run-time efficiency or spaceconsumption. Essentially it consists of a sequence of syntactic program manipulations which preserve a kind of semantic equivalence called qualified answer [Frü98], between the original program and the transformed ones. The unfolding technique is one of the basic operations which is used by most program transformation systems. It consists in the replacement of a procedure-call by its definition. In CHR every conjunction of constraints can be considered as a procedure-call, every CHR rule can be considered as a procedure and the body of said rule represents the definition of the call. While there is a large body of literature on transformation and unfolding of sequential programs, very few papers have addressed this issue for concurrent languages. We define an unfolding rule, show its correctness and discuss some conditions in which it can be used to delete an unfolded rule while preserving the meaning of the original program. Finally, confluence and termination maintenance between the original and transformed programs are shown. This thesis is organized in the following manner. Chapter 1 gives some general notion about CHR. Section 1.1 outlines the history of programming languages with particular attention to CHR and related languages. Then, Section 1.2 introduces CHR using examples. Section 1.3 gives some preliminaries which will be used during the thesis. Subsequentely, Section 1.4 introduces the syntax and the operational and declarative semantics for the first CHR language proposed. Finally, the methodologies to solve the problem of trivial non-termination related to propagation rules are discussed in Section 1.5. Chapter 2 introduces a compositional semantics for CHR where the propagation rules are considered. In particular, Section 2.1 contains the definition of the semantics. Hence, Section 2.2 presents the compositionality results. Afterwards Section 2.3 expounds upon the correctness results. Chapter 3 presents a particular program transformation known as unfolding. This transformation needs a particular syntax called annotated which is introduced in Section 3.1 and its related modified operational semantics !0t is presented in Section 3.2. Subsequently, Section 3.3 defines the unfolding rule and prove its correctness. Then, in Section 3.4 the problems related to the replacement of a rule by its unfolded version are discussed and this in turn gives a correctness condition which holds for a specific class of rules. Section 3.5 proves that confluence and termination are preserved by the program modifications introduced. Finally, Chapter 4 concludes by discussing related works and directions for future work.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In the last years, Intelligent Tutoring Systems have been a very successful way for improving learning experience. Many issues must be addressed until this technology can be defined mature. One of the main problems within the Intelligent Tutoring Systems is the process of contents authoring: knowledge acquisition and manipulation processes are difficult tasks because they require a specialised skills on computer programming and knowledge engineering. In this thesis we discuss a general framework for knowledge management in an Intelligent Tutoring System and propose a mechanism based on first order data mining to partially automate the process of knowledge acquisition that have to be used in the ITS during the tutoring process. Such a mechanism can be applied in Constraint Based Tutor and in the Pseudo-Cognitive Tutor. We design and implement a part of the proposed architecture, mainly the module of knowledge acquisition from examples based on first order data mining. We then show that the algorithm can be applied at least two different domains: first order algebra equation and some topics of C programming language. Finally we discuss the limitation of current approach and the possible improvements of the whole framework.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Generic programming is likely to become a new challenge for a critical mass of developers. Therefore, it is crucial to refine the support for generic programming in mainstream Object-Oriented languages — both at the design and at the implementation level — as well as to suggest novel ways to exploit the additional degree of expressiveness made available by genericity. This study is meant to provide a contribution towards bringing Java genericity to a more mature stage with respect to mainstream programming practice, by increasing the effectiveness of its implementation, and by revealing its full expressive power in real world scenario. With respect to the current research setting, the main contribution of the thesis is twofold. First, we propose a revised implementation for Java generics that greatly increases the expressiveness of the Java platform by adding reification support for generic types. Secondly, we show how Java genericity can be leveraged in a real world case-study in the context of the multi-paradigm language integration. Several approaches have been proposed in order to overcome the lack of reification of generic types in the Java programming language. Existing approaches tackle the problem of reification of generic types by defining new translation techniques which would allow for a runtime representation of generics and wildcards. Unfortunately most approaches suffer from several problems: heterogeneous translations are known to be problematic when considering reification of generic methods and wildcards. On the other hand, more sophisticated techniques requiring changes in the Java runtime, supports reified generics through a true language extension (where clauses) so that backward compatibility is compromised. In this thesis we develop a sophisticated type-passing technique for addressing the problem of reification of generic types in the Java programming language; this approach — first pioneered by the so called EGO translator — is here turned into a full-blown solution which reifies generic types inside the Java Virtual Machine (JVM) itself, thus overcoming both performance penalties and compatibility issues of the original EGO translator. Java-Prolog integration Integrating Object-Oriented and declarative programming has been the subject of several researches and corresponding technologies. Such proposals come in two flavours, either attempting at joining the two paradigms, or simply providing an interface library for accessing Prolog declarative features from a mainstream Object-Oriented languages such as Java. Both solutions have however drawbacks: in the case of hybrid languages featuring both Object-Oriented and logic traits, such resulting language is typically too complex, thus making mainstream application development an harder task; in the case of library-based integration approaches there is no true language integration, and some “boilerplate code” has to be implemented to fix the paradigm mismatch. In this thesis we develop a framework called PatJ which promotes seamless exploitation of Prolog programming in Java. A sophisticated usage of generics/wildcards allows to define a precise mapping between Object-Oriented and declarative features. PatJ defines a hierarchy of classes where the bidirectional semantics of Prolog terms is modelled directly at the level of the Java generic type-system.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Il problema dell'antibiotico-resistenza è un problema di sanità pubblica per affrontare il quale è necessario un sistema di sorveglianza basato sulla raccolta e l'analisi dei dati epidemiologici di laboratorio. Il progetto di dottorato è consistito nello sviluppo di una applicazione web per la gestione di tali dati di antibiotico sensibilità di isolati clinici utilizzabile a livello di ospedale. Si è creata una piattaforma web associata a un database relazionale per avere un’applicazione dinamica che potesse essere aggiornata facilmente inserendo nuovi dati senza dover manualmente modificare le pagine HTML che compongono l’applicazione stessa. E’ stato utilizzato il database open-source MySQL in quanto presenta numerosi vantaggi: estremamente stabile, elevate prestazioni, supportato da una grande comunità online ed inoltre gratuito. Il contenuto dinamico dell’applicazione web deve essere generato da un linguaggio di programmazione tipo “scripting” che automatizzi operazioni di inserimento, modifica, cancellazione, visualizzazione di larghe quantità di dati. E’ stato scelto il PHP, linguaggio open-source sviluppato appositamente per la realizzazione di pagine web dinamiche, perfettamente utilizzabile con il database MySQL. E’ stata definita l’architettura del database creando le tabelle contenenti i dati e le relazioni tra di esse: le anagrafiche, i dati relativi ai campioni, microrganismi isolati e agli antibiogrammi con le categorie interpretative relative al dato antibiotico. Definite tabelle e relazioni del database è stato scritto il codice associato alle funzioni principali: inserimento manuale di antibiogrammi, importazione di antibiogrammi multipli provenienti da file esportati da strumenti automatizzati, modifica/eliminazione degli antibiogrammi precedenti inseriti nel sistema, analisi dei dati presenti nel database con tendenze e andamenti relativi alla prevalenza di specie microbiche e alla chemioresistenza degli stessi, corredate da grafici. Lo sviluppo ha incluso continui test delle funzioni via via implementate usando reali dati clinici e sono stati introdotti appositi controlli e l’introduzione di una semplice e pulita veste grafica.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The thesis applies the ICC tecniques to the probabilistic polinomial complexity classes in order to get an implicit characterization of them. The main contribution lays on the implicit characterization of PP (which stands for Probabilistic Polynomial Time) class, showing a syntactical characterisation of PP and a static complexity analyser able to recognise if an imperative program computes in Probabilistic Polynomial Time. The thesis is divided in two parts. The first part focuses on solving the problem by creating a prototype of functional language (a probabilistic variation of lambda calculus with bounded recursion) that is sound and complete respect to Probabilistic Prolynomial Time. The second part, instead, reverses the problem and develops a feasible way to verify if a program, written with a prototype of imperative programming language, is running in Probabilistic polynomial time or not. This thesis would characterise itself as one of the first step for Implicit Computational Complexity over probabilistic classes. There are still open hard problem to investigate and try to solve. There are a lot of theoretical aspects strongly connected with these topics and I expect that in the future there will be wide attention to ICC and probabilistic classes.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The central topic of this thesis is the study of algorithms for type checking, both from the programming language and from the proof-theoretic point of view. A type checking algorithm takes a program or a proof, represented as a syntactical object, and checks its validity with respect to a specification or a statement. It is a central piece of compilers and proof assistants. We postulate that since type checkers are at the interface between proof theory and program theory, their study can let these two fields mutually enrich each other. We argue by two main instances: first, starting from the problem of proof reuse, we develop an incremental type checker; secondly, starting from a type checking program, we evidence a novel correspondence between natural deduction and the sequent calculus.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Coastal flooding poses serious threats to coastal areas around the world, billions of dollars in damage to property and infrastructure, and threatens the lives of millions of people. Therefore, disaster management and risk assessment aims at detecting vulnerability and capacities in order to reduce coastal flood disaster risk. In particular, non-specialized researchers, emergency management personnel, and land use planners require an accurate, inexpensive method to determine and map risk associated with storm surge events and long-term sea level rise associated with climate change. This study contributes to the spatially evaluation and mapping of social-economic-environmental vulnerability and risk at sub-national scale through the development of appropriate tools and methods successfully embedded in a Web-GIS Decision Support System. A new set of raster-based models were studied and developed in order to be easily implemented in the Web-GIS framework with the purpose to quickly assess and map flood hazards characteristics, damage and vulnerability in a Multi-criteria approach. The Web-GIS DSS is developed recurring to open source software and programming language and its main peculiarity is to be available and usable by coastal managers and land use planners without requiring high scientific background in hydraulic engineering. The effectiveness of the system in the coastal risk assessment is evaluated trough its application to a real case study.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Internet of Things systems are pervasive systems evolved from cyber-physical to large-scale systems. Due to the number of technologies involved, software development involves several integration challenges. Among them, the ones preventing proper integration are those related to the system heterogeneity, and thus addressing interoperability issues. From a software engineering perspective, developers mostly experience the lack of interoperability in the two phases of software development: programming and deployment. On the one hand, modern software tends to be distributed in several components, each adopting its most-appropriate technology stack, pushing programmers to code in a protocol- and data-agnostic way. On the other hand, each software component should run in the most appropriate execution environment and, as a result, system architects strive to automate the deployment in distributed infrastructures. This dissertation aims to improve the development process by introducing proper tools to handle certain aspects of the system heterogeneity. Our effort focuses on three of these aspects and, for each one of those, we propose a tool addressing the underlying challenge. The first tool aims to handle heterogeneity at the transport and application protocol level, the second to manage different data formats, while the third to obtain optimal deployment. To realize the tools, we adopted a linguistic approach, i.e.\ we provided specific linguistic abstractions that help developers to increase the expressive power of the programming language they use, writing better solutions in more straightforward ways. To validate the approach, we implemented use cases to show that the tools can be used in practice and that they help to achieve the expected level of interoperability. In conclusion, to move a step towards the realization of an integrated Internet of Things ecosystem, we target programmers and architects and propose them to use the presented tools to ease the software development process.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Using Big Data and Natural Language Processing (NLP) tools, this dissertation investigates the narrative strategies that atypical actors can leverage to deal with the adverse reactions they often elicit. Extensive research shows that atypical actors, those who fail to abide by established contextual standards and norms, are subject to skepticism and face a higher risk of rejection. Indeed, atypical actors combine features and behaviors in unconventional ways, thereby generating confusion in the audience and instilling doubts about their propositions' legitimacy. However, the same atypicality is often cited as the precursor to socio-cultural innovation and a strategic act to expand the capacity for delivering valued goods and services. Contextualizing the conditions under which atypicality is celebrated or punished has been a significant theoretical challenge for scholars interested in reconciling this tension. Nevertheless, prior work has focused on audience side factors or on actor-side characteristics that are only scantily under an actor's control (e.g., status and reputation). This dissertation demonstrates that atypical actors can use strategically crafted narratives to mitigate against the audience’s negative response. In particular, when atypical actors evoke conventional features in their story, they are more likely to overcome the illegitimacy discount usually applied to them. Moreover, narratives become successful navigational devices for atypicality when atypical actors use a more abstract language. This simplifies classification and provides the audience with more flexibility to interpret and understand them.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The availability of a huge amount of source code from code archives and open-source projects opens up the possibility to merge machine learning, programming languages, and software engineering research fields. This area is often referred to as Big Code where programming languages are treated instead of natural languages while different features and patterns of code can be exploited to perform many useful tasks and build supportive tools. Among all the possible applications which can be developed within the area of Big Code, the work presented in this research thesis mainly focuses on two particular tasks: the Programming Language Identification (PLI) and the Software Defect Prediction (SDP) for source codes. Programming language identification is commonly needed in program comprehension and it is usually performed directly by developers. However, when it comes at big scales, such as in widely used archives (GitHub, Software Heritage), automation of this task is desirable. To accomplish this aim, the problem is analyzed from different points of view (text and image-based learning approaches) and different models are created paying particular attention to their scalability. Software defect prediction is a fundamental step in software development for improving quality and assuring the reliability of software products. In the past, defects were searched by manual inspection or using automatic static and dynamic analyzers. Now, the automation of this task can be tackled using learning approaches that can speed up and improve related procedures. Here, two models have been built and analyzed to detect some of the commonest bugs and errors at different code granularity levels (file and method levels). Exploited data and models’ architectures are analyzed and described in detail. Quantitative and qualitative results are reported for both PLI and SDP tasks while differences and similarities concerning other related works are discussed.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

One of the most visionary goals of Artificial Intelligence is to create a system able to mimic and eventually surpass the intelligence observed in biological systems including, ambitiously, the one observed in humans. The main distinctive strength of humans is their ability to build a deep understanding of the world by learning continuously and drawing from their experiences. This ability, which is found in various degrees in all intelligent biological beings, allows them to adapt and properly react to changes by incrementally expanding and refining their knowledge. Arguably, achieving this ability is one of the main goals of Artificial Intelligence and a cornerstone towards the creation of intelligent artificial agents. Modern Deep Learning approaches allowed researchers and industries to achieve great advancements towards the resolution of many long-standing problems in areas like Computer Vision and Natural Language Processing. However, while this current age of renewed interest in AI allowed for the creation of extremely useful applications, a concerningly limited effort is being directed towards the design of systems able to learn continuously. The biggest problem that hinders an AI system from learning incrementally is the catastrophic forgetting phenomenon. This phenomenon, which was discovered in the 90s, naturally occurs in Deep Learning architectures where classic learning paradigms are applied when learning incrementally from a stream of experiences. This dissertation revolves around the Continual Learning field, a sub-field of Machine Learning research that has recently made a comeback following the renewed interest in Deep Learning approaches. This work will focus on a comprehensive view of continual learning by considering algorithmic, benchmarking, and applicative aspects of this field. This dissertation will also touch on community aspects such as the design and creation of research tools aimed at supporting Continual Learning research, and the theoretical and practical aspects concerning public competitions in this field.