965 resultados para Natural language processing (Computer science)
Resumo:
Linguistic modelling is a rather new branch of mathematics that is still undergoing rapid development. It is closely related to fuzzy set theory and fuzzy logic, but knowledge and experience from other fields of mathematics, as well as other fields of science including linguistics and behavioral sciences, is also necessary to build appropriate mathematical models. This topic has received considerable attention as it provides tools for mathematical representation of the most common means of human communication - natural language. Adding a natural language level to mathematical models can provide an interface between the mathematical representation of the modelled system and the user of the model - one that is sufficiently easy to use and understand, but yet conveys all the information necessary to avoid misinterpretations. It is, however, not a trivial task and the link between the linguistic and computational level of such models has to be established and maintained properly during the whole modelling process. In this thesis, we focus on the relationship between the linguistic and the mathematical level of decision support models. We discuss several important issues concerning the mathematical representation of meaning of linguistic expressions, their transformation into the language of mathematics and the retranslation of mathematical outputs back into natural language. In the first part of the thesis, our view of the linguistic modelling for decision support is presented and the main guidelines for building linguistic models for real-life decision support that are the basis of our modeling methodology are outlined. From the theoretical point of view, the issues of representation of meaning of linguistic terms, computations with these representations and the retranslation process back into the linguistic level (linguistic approximation) are studied in this part of the thesis. We focus on the reasonability of operations with the meanings of linguistic terms, the correspondence of the linguistic and mathematical level of the models and on proper presentation of appropriate outputs. We also discuss several issues concerning the ethical aspects of decision support - particularly the loss of meaning due to the transformation of mathematical outputs into natural language and the issue or responsibility for the final decisions. In the second part several case studies of real-life problems are presented. These provide background and necessary context and motivation for the mathematical results and models presented in this part. A linguistic decision support model for disaster management is presented here – formulated as a fuzzy linear programming problem and a heuristic solution to it is proposed. Uncertainty of outputs, expert knowledge concerning disaster response practice and the necessity of obtaining outputs that are easy to interpret (and available in very short time) are reflected in the design of the model. Saaty’s analytic hierarchy process (AHP) is considered in two case studies - first in the context of the evaluation of works of art, where a weak consistency condition is introduced and an adaptation of AHP for large matrices of preference intensities is presented. The second AHP case-study deals with the fuzzified version of AHP and its use for evaluation purposes – particularly the integration of peer-review into the evaluation of R&D outputs is considered. In the context of HR management, we present a fuzzy rule based evaluation model (academic faculty evaluation is considered) constructed to provide outputs that do not require linguistic approximation and are easily transformed into graphical information. This is achieved by designing a specific form of fuzzy inference. Finally the last case study is from the area of humanities - psychological diagnostics is considered and a linguistic fuzzy model for the interpretation of outputs of multidimensional questionnaires is suggested. The issue of the quality of data in mathematical classification models is also studied here. A modification of the receiver operating characteristics (ROC) method is presented to reflect variable quality of data instances in the validation set during classifier performance assessment. Twelve publications on which the author participated are appended as a third part of this thesis. These summarize the mathematical results and provide a closer insight into the issues of the practicalapplications that are considered in the second part of the thesis.
Resumo:
Software is a key component in many of our devices and products that we use every day. Most customers demand not only that their devices should function as expected but also that the software should be of high quality, reliable, fault tolerant, efficient, etc. In short, it is not enough that a calculator gives the correct result of a calculation, we want the result instantly, in the right form, with minimal use of battery, etc. One of the key aspects for succeeding in today's industry is delivering high quality. In most software development projects, high-quality software is achieved by rigorous testing and good quality assurance practices. However, today, customers are asking for these high quality software products at an ever-increasing pace. This leaves the companies with less time for development. Software testing is an expensive activity, because it requires much manual work. Testing, debugging, and verification are estimated to consume 50 to 75 per cent of the total development cost of complex software projects. Further, the most expensive software defects are those which have to be fixed after the product is released. One of the main challenges in software development is reducing the associated cost and time of software testing without sacrificing the quality of the developed software. It is often not enough to only demonstrate that a piece of software is functioning correctly. Usually, many other aspects of the software, such as performance, security, scalability, usability, etc., need also to be verified. Testing these aspects of the software is traditionally referred to as nonfunctional testing. One of the major challenges with non-functional testing is that it is usually carried out at the end of the software development process when most of the functionality is implemented. This is due to the fact that non-functional aspects, such as performance or security, apply to the software as a whole. In this thesis, we study the use of model-based testing. We present approaches to automatically generate tests from behavioral models for solving some of these challenges. We show that model-based testing is not only applicable to functional testing but also to non-functional testing. In its simplest form, performance testing is performed by executing multiple test sequences at once while observing the software in terms of responsiveness and stability, rather than the output. The main contribution of the thesis is a coherent model-based testing approach for testing functional and performance related issues in software systems. We show how we go from system models, expressed in the Unified Modeling Language, to test cases and back to models again. The system requirements are traced throughout the entire testing process. Requirements traceability facilitates finding faults in the design and implementation of the software. In the research field of model-based testing, many new proposed approaches suffer from poor or the lack of tool support. Therefore, the second contribution of this thesis is proper tool support for the proposed approach that is integrated with leading industry tools. We o er independent tools, tools that are integrated with other industry leading tools, and complete tool-chains when necessary. Many model-based testing approaches proposed by the research community suffer from poor empirical validation in an industrial context. In order to demonstrate the applicability of our proposed approach, we apply our research to several systems, including industrial ones.
Resumo:
The advancement of science and technology makes it clear that no single perspective is any longer sufficient to describe the true nature of any phenomenon. That is why the interdisciplinary research is gaining more attention overtime. An excellent example of this type of research is natural computing which stands on the borderline between biology and computer science. The contribution of research done in natural computing is twofold: on one hand, it sheds light into how nature works and how it processes information and, on the other hand, it provides some guidelines on how to design bio-inspired technologies. The first direction in this thesis focuses on a nature-inspired process called gene assembly in ciliates. The second one studies reaction systems, as a modeling framework with its rationale built upon the biochemical interactions happening within a cell. The process of gene assembly in ciliates has attracted a lot of attention as a research topic in the past 15 years. Two main modelling frameworks have been initially proposed in the end of 1990s to capture ciliates’ gene assembly process, namely the intermolecular model and the intramolecular model. They were followed by other model proposals such as templatebased assembly and DNA rearrangement pathways recombination models. In this thesis we are interested in a variation of the intramolecular model called simple gene assembly model, which focuses on the simplest possible folds in the assembly process. We propose a new framework called directed overlap-inclusion (DOI) graphs to overcome the limitations that previously introduced models faced in capturing all the combinatorial details of the simple gene assembly process. We investigate a number of combinatorial properties of these graphs, including a necessary property in terms of forbidden induced subgraphs. We also introduce DOI graph-based rewriting rules that capture all the operations of the simple gene assembly model and prove that they are equivalent to the string-based formalization of the model. Reaction systems (RS) is another nature-inspired modeling framework that is studied in this thesis. Reaction systems’ rationale is based upon two main regulation mechanisms, facilitation and inhibition, which control the interactions between biochemical reactions. Reaction systems is a complementary modeling framework to traditional quantitative frameworks, focusing on explicit cause-effect relationships between reactions. The explicit formulation of facilitation and inhibition mechanisms behind reactions, as well as the focus on interactions between reactions (rather than dynamics of concentrations) makes their applicability potentially wide and useful beyond biological case studies. In this thesis, we construct a reaction system model corresponding to the heat shock response mechanism based on a novel concept of dominance graph that captures the competition on resources in the ODE model. We also introduce for RS various concepts inspired by biology, e.g., mass conservation, steady state, periodicity, etc., to do model checking of the reaction systems based models. We prove that the complexity of the decision problems related to these properties varies from P to NP- and coNP-complete to PSPACE-complete. We further focus on the mass conservation relation in an RS and introduce the conservation dependency graph to capture the relation between the species and also propose an algorithm to list the conserved sets of a given reaction system.
Resumo:
Human beings have always strived to preserve their memories and spread their ideas. In the beginning this was always done through human interpretations, such as telling stories and creating sculptures. Later, technological progress made it possible to create a recording of a phenomenon; first as an analogue recording onto a physical object, and later digitally, as a sequence of bits to be interpreted by a computer. By the end of the 20th century technological advances had made it feasible to distribute media content over a computer network instead of on physical objects, thus enabling the concept of digital media distribution. Many digital media distribution systems already exist, and their continued, and in many cases increasing, usage is an indicator for the high interest in their future enhancements and enriching. By looking at these digital media distribution systems, we have identified three main areas of possible improvement: network structure and coordination, transport of content over the network, and the encoding used for the content. In this thesis, our aim is to show that improvements in performance, efficiency and availability can be done in conjunction with improvements in software quality and reliability through the use of formal methods: mathematical approaches to reasoning about software so that we can prove its correctness, together with the desirable properties. We envision a complete media distribution system based on a distributed architecture, such as peer-to-peer networking, in which different parts of the system have been formally modelled and verified. Starting with the network itself, we show how it can be formally constructed and modularised in the Event-B formalism, such that we can separate the modelling of one node from the modelling of the network itself. We also show how the piece selection algorithm in the BitTorrent peer-to-peer transfer protocol can be adapted for on-demand media streaming, and how this can be modelled in Event-B. Furthermore, we show how modelling one peer in Event-B can give results similar to simulating an entire network of peers. Going further, we introduce a formal specification language for content transfer algorithms, and show that having such a language can make these algorithms easier to understand. We also show how generating Event-B code from this language can result in less complexity compared to creating the models from written specifications. We also consider the decoding part of a media distribution system by showing how video decoding can be done in parallel. This is based on formally defined dependencies between frames and blocks in a video sequence; we have shown that also this step can be performed in a way that is mathematically proven correct. Our modelling and proving in this thesis is, in its majority, tool-based. This provides a demonstration of the advance of formal methods as well as their increased reliability, and thus, advocates for their more wide-spread usage in the future.
Resumo:
Subshifts are sets of configurations over an infinite grid defined by a set of forbidden patterns. In this thesis, we study two-dimensional subshifts offinite type (2D SFTs), where the underlying grid is Z2 and the set of for-bidden patterns is finite. We are mainly interested in the interplay between the computational power of 2D SFTs and their geometry, examined through the concept of expansive subdynamics. 2D SFTs with expansive directions form an interesting and natural class of subshifts that lie between dimensions 1 and 2. An SFT that has only one non-expansive direction is called extremely expansive. We prove that in many aspects, extremely expansive 2D SFTs display the totality of behaviours of general 2D SFTs. For example, we construct an aperiodic extremely expansive 2D SFT and we prove that the emptiness problem is undecidable even when restricted to the class of extremely expansive 2D SFTs. We also prove that every Medvedev class contains an extremely expansive 2D SFT and we provide a characterization of the sets of directions that can be the set of non-expansive directions of a 2D SFT. Finally, we prove that for every computable sequence of 2D SFTs with an expansive direction, there exists a universal object that simulates all of the elements of the sequence. We use the so called hierarchical, self-simulating or fixed-point method for constructing 2D SFTs which has been previously used by Ga´cs, Durand, Romashchenko and Shen.