16 results for graph representation
in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Abstract:
Biomedical natural language processing (BioNLP) is a subfield of natural language processing, an area of computational linguistics concerned with developing programs that work with natural language: written texts and speech. Biomedical relation extraction concerns the detection of semantic relations such as protein-protein interactions (PPI) from scientific texts. The aim is to enhance information retrieval by detecting relations between concepts, not just individual concepts as with a keyword search. In recent years, events have been proposed as a more detailed alternative to simple pairwise PPI relations. Events provide a systematic, structural representation for annotating the content of natural language texts. Events are characterized by annotated trigger words, directed and typed arguments and the ability to nest other events. For example, the sentence “Protein A causes protein B to bind protein C” can be annotated with the nested event structure CAUSE(A, BIND(B, C)). Converted to such formal representations, the information in natural language texts can be used by computational applications. Biomedical event annotations were introduced by the BioInfer and GENIA corpora, and event extraction was popularized by the BioNLP'09 Shared Task on Event Extraction. In this thesis we present a method for automated event extraction, implemented as the Turku Event Extraction System (TEES). A unified graph format is defined for representing event annotations and the problem of extracting complex event structures is decomposed into a number of independent classification tasks. These classification tasks are solved using SVM and RLS classifiers, utilizing rich feature representations built from full dependency parsing. Building on earlier work on pairwise relation extraction and using a generalized graph representation, the resulting TEES system is capable of detecting binary relations as well as complex event structures.
We show that this event extraction system performs well, taking first place in the BioNLP'09 Shared Task on Event Extraction. Subsequently, TEES has achieved several first ranks in the BioNLP'11 and BioNLP'13 Shared Tasks, and has shown competitive performance in the binary relation Drug-Drug Interaction Extraction 2011 and 2013 shared tasks. The Turku Event Extraction System is published as a freely available open-source project, documenting the research in detail as well as making the method available for practical applications. In particular, in this thesis we describe the application of the event extraction method to PubMed-scale text mining, showing that the developed approach not only performs well but also generalizes and is applicable to large-scale real-world text mining projects. Finally, we discuss related literature, summarize the contributions of the work and present some thoughts on future directions for biomedical event extraction. This thesis includes and builds on six original research publications. The first of these introduces the analysis of dependency parses that led to the development of TEES. The entries in the three BioNLP Shared Tasks, as well as in the DDIExtraction 2011 task, are covered in four publications, and the sixth one demonstrates the application of the system to PubMed-scale text mining.
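As an illustration of the nested event structure CAUSE(A, BIND(B, C)) mentioned in this abstract, the following is a minimal Python sketch of how such an event could be held as a graph of directed, typed edges. It is not the TEES graph format: the `Event` class, the `render` and `edges` helpers, and the role names "Cause" and "Theme" are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Event:
    """Hypothetical event node: a type plus typed arguments."""
    kind: str                                     # event type, e.g. "BIND"
    args: List[Tuple[str, Union[str, "Event"]]]   # (role, target) pairs


def render(node):
    """Print an entity or event in the nested notation TYPE(arg, ...)."""
    if isinstance(node, str):
        return node
    return f"{node.kind}({', '.join(render(target) for _, target in node.args)})"


def edges(node):
    """Flatten a nested event into (source, role, target) edges."""
    found = []
    if isinstance(node, str):
        return found                              # plain entities have no outgoing edges
    for role, target in node.args:
        label = target if isinstance(target, str) else target.kind
        found.append((node.kind, role, label))
        found.extend(edges(target))               # descend into nested events
    return found


# "Protein A causes protein B to bind protein C"
bind = Event("BIND", [("Theme", "B"), ("Theme", "C")])
cause = Event("CAUSE", [("Cause", "A"), ("Theme", bind)])

print(render(cause))
for edge in edges(cause):
    print(edge)
```

Because an argument may itself be an `Event`, nesting falls out of the data structure, and the flattened edge list is the kind of input that independent per-edge classifiers could work on.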
Abstract:
With the shift towards many-core computer architectures, dataflow programming has been proposed as one potential solution for producing software that scales to a varying number of processor cores. Programming for parallel architectures is considered difficult as the current popular programming languages are inherently sequential and introducing parallelism is typically up to the programmer. Dataflow, however, is inherently parallel, describing an application as a directed graph, where nodes represent calculations and edges represent a data dependency in the form of a queue. These queues are the only allowed communication between the nodes, making the dependencies between the nodes explicit and thereby also the parallelism. Once a node has sufficient inputs available, the node can, independently of any other node, perform calculations, consume inputs, and produce outputs. Dataflow models have existed for several decades and have become popular for describing signal processing applications as the graph representation is a very natural representation within this field. Digital filters are typically described with boxes and arrows also in textbooks. Dataflow is also becoming more interesting in other domains, and in principle, any application working on an information stream fits the dataflow paradigm. Such applications are, among others, network protocols, cryptography, and multimedia applications. As an example, the MPEG group standardized a dataflow language called RVC-CAL to be used within reconfigurable video coding. Describing a video coder as a dataflow network instead of with conventional programming languages makes the coder more readable as it describes how the video data flows through the different coding tools. While dataflow provides an intuitive representation for many applications, it also introduces some new problems that need to be solved in order for dataflow to be more widely used.
The explicit parallelism of a dataflow program is descriptive and enables an improved utilization of available processing units; however, the independent nodes also imply that some kind of scheduling is required. The need for efficient scheduling becomes even more evident when the number of nodes is larger than the number of processing units and several nodes are running concurrently on one processor core. There exist several dataflow models of computation, with different trade-offs between expressiveness and analyzability. These vary from rather restricted but statically schedulable, with minimal scheduling overhead, to dynamic, where each firing requires a firing rule to be evaluated. The model used in this work, namely RVC-CAL, is a very expressive language, and in the general case it requires dynamic scheduling; however, the strong encapsulation of dataflow nodes enables analysis, and the scheduling overhead can be reduced by using quasi-static, or piecewise static, scheduling techniques. The scheduling problem is concerned with finding the few scheduling decisions that must be made at run time, while most decisions are pre-calculated. The result is then a set, as small as possible, of static schedules that are dynamically scheduled. To identify these dynamic decisions and to find the concrete schedules, this thesis shows how quasi-static scheduling can be represented as a model checking problem. This involves identifying the relevant information to generate a minimal but complete model to be used for model checking. The model must describe everything that may affect scheduling of the application while omitting everything else in order to avoid state space explosion. This kind of simplification is necessary to make the state space analysis feasible. For the model checker to find the actual schedules, a set of scheduling strategies is defined which is able to produce quasi-static schedulers for a wide range of applications.
The results of this work show that actor composition with quasi-static scheduling can be used to transform dataflow programs to fit many different computer architectures with different types and numbers of cores. This, in turn, enables dataflow to provide a more platform-independent representation, as one application can be fitted to a specific processor architecture without changing the actual program representation. Instead, the program representation is, in the context of design space exploration, optimized by the development tools to fit the target platform. This work focuses on representing the dataflow scheduling problem as a model checking problem and is implemented as part of a compiler infrastructure. The thesis also presents experimental results as evidence of the usefulness of the approach.
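The dataflow model described in this abstract (nodes that communicate only through queues and fire once their inputs are available) can be sketched in a few lines of Python. This is a deliberately naive, fully dynamic scheduler that evaluates a firing rule before every firing; it is not RVC-CAL and not the quasi-static scheduling developed in the thesis. The `Actor` class, the `run` loop and the two example actors are hypothetical names chosen for this sketch.

```python
from collections import deque


class Actor:
    """Hypothetical dataflow node: reads input queues, writes output queues."""

    def __init__(self, name, func, inputs, outputs):
        self.name = name
        self.func = func
        self.inputs = inputs      # list of deque objects (incoming edges)
        self.outputs = outputs    # list of deque objects (outgoing edges)

    def can_fire(self):
        # Firing rule: at least one token on every input queue.
        return all(len(queue) >= 1 for queue in self.inputs)

    def fire(self):
        # Consume one token per input, compute, produce one token per output.
        tokens = [queue.popleft() for queue in self.inputs]
        result = self.func(*tokens)
        for queue in self.outputs:
            queue.append(result)


def run(actors):
    """Dynamic scheduling: keep firing any enabled actor until none is enabled."""
    fired = True
    while fired:
        fired = False
        for actor in actors:
            if actor.can_fire():
                actor.fire()
                fired = True


# A two-stage pipeline: source queue -> double -> increment -> sink queue.
q_in, q_mid, q_out = deque([1, 2, 3]), deque(), deque()
double = Actor("double", lambda x: 2 * x, [q_in], [q_mid])
increment = Actor("increment", lambda x: x + 1, [q_mid], [q_out])

run([double, increment])
print(list(q_out))
```

Every pass through `run` re-checks `can_fire` for every actor, which is the scheduling overhead that a pre-computed static or quasi-static schedule aims to avoid.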
Abstract:
In recent years, the networks of broadband operators have grown into large entities as a result of fast, flat-rate broadband subscriptions. These entities are managed with various network management tools. Network management tools contain a large amount of information at different levels about devices and the relationships between them. Grasping the whole without a picture built from this information is difficult and slow. In visualizing the topology of a broadband network, a picture is formed of the devices and their relationships. The visualized picture can be used as part of a network management tool, quickly giving the user a view of the devices and structure of the network, that is, its topology. For visualization, the problem of drawing the picture must be converted into a graph drawing problem. In the graph drawing problem, the structure of the network is treated as a graph, which makes it possible to produce the picture using automatic drawing methods. The desired appearance of the picture is produced with automatic drawing methods, which can alter the way devices and the relationships between devices are presented. The presentation styles can change, for example, the shape, colour and size of the devices. Beyond presentation styles, the most important task of the drawing methods is to compute the coordinate values of the device positions, which ultimately determine the structure of the whole picture. The coordinate values are computed with drawing algorithms, of which force-based algorithms are best suited to computing the positions of devices in broadband networks. In the practical part of this Master's thesis, a visualization tool for broadband network topology was implemented.
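To make the idea of a force-based drawing algorithm concrete, here is a minimal Python sketch that computes node coordinates for a small topology. It uses repulsive and attractive forces in the style of Fruchterman and Reingold, simplified by replacing the cooling schedule with a fixed cap on movement per iteration. It is not the tool built in the thesis; the function name `force_layout`, its parameters and the example device names are assumptions made for this illustration.

```python
import math
import random


def force_layout(nodes, edges, iterations=200, k=1.0, step=0.05, seed=0):
    """Return {node: (x, y)} computed with simple spring and repulsion forces."""
    rng = random.Random(seed)                       # fixed seed: repeatable layout
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes, magnitude k^2 / distance.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                force = k * k / dist
                disp[a][0] += dx / dist * force
                disp[a][1] += dy / dist * force
                disp[b][0] -= dx / dist * force
                disp[b][1] -= dy / dist * force
        # Attraction along every edge, magnitude distance^2 / k.
        for a, b in edges:
            dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            force = dist * dist / k
            disp[a][0] -= dx / dist * force
            disp[a][1] -= dy / dist * force
            disp[b][0] += dx / dist * force
            disp[b][1] += dy / dist * force
        # Move each node along its net force, capped at `step` per iteration.
        for n in nodes:
            length = math.hypot(disp[n][0], disp[n][1])
            if length > 0:
                scale = min(length, step) / length
                pos[n][0] += disp[n][0] * scale
                pos[n][1] += disp[n][1] * scale
    return {n: (p[0], p[1]) for n, p in pos.items()}


# Hypothetical topology: one router connected to three switches.
devices = ["router", "switch1", "switch2", "switch3"]
links = [("router", "switch1"), ("router", "switch2"), ("router", "switch3")]
layout = force_layout(devices, links)
for name, (x, y) in layout.items():
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

The returned coordinates are the only thing a renderer needs from the algorithm; shape, colour and size of each device remain separate presentation choices.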
Abstract:
This thesis addresses the problem of computing the minimal and maximal diameter of the Cayley graph of Coxeter groups. We first present relevant parts of polytope theory and related Coxeter theory. After this, a method of constructing the orthogonal projections of a polytope from R^d onto R^2 and R^3, d ≥ 3, is presented. This method is the Equality Set Projection (ESP) algorithm, which requires a constant number of linear programming problems per facet of the projection in the absence of degeneracy. The ESP algorithm also allows us to compute projected geometric diameters of high-dimensional polytopes. A representative set of projected polytopes is presented to illustrate the methods adopted in this thesis.
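For readers unfamiliar with the graph diameter of a Cayley graph, the following Python sketch computes it by breadth-first search for one small family of Coxeter groups: the symmetric group on n elements, generated by adjacent transpositions. This is a brute-force illustration of the graph-theoretic notion only; it is not the Equality Set Projection algorithm and does not compute the projected geometric diameters discussed in the abstract. The function name `cayley_diameter` is an assumption of this sketch.

```python
from collections import deque


def cayley_diameter(n):
    """Return (diameter, group order) for the symmetric group on n elements
    with adjacent transpositions as generators.

    A Cayley graph is vertex-transitive, so the largest BFS distance from
    the identity equals the diameter of the whole graph.
    """
    identity = tuple(range(n))
    distance = {identity: 0}
    queue = deque([identity])
    while queue:
        perm = queue.popleft()
        for i in range(n - 1):
            # Apply the generator that swaps positions i and i + 1.
            neighbour = list(perm)
            neighbour[i], neighbour[i + 1] = neighbour[i + 1], neighbour[i]
            neighbour = tuple(neighbour)
            if neighbour not in distance:
                distance[neighbour] = distance[perm] + 1
                queue.append(neighbour)
    return max(distance.values()), len(distance)


for n in (3, 4, 5):
    print(n, cayley_diameter(n))
```

For this family the result matches the known value n(n-1)/2, the length of the longest element; the search visits all n! group elements, so it is practical only for small n.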
Abstract:
The changing approaches within feminist development economics bring with them new ways of talking about women, men and development. By analysing texts written in the field of feminist economics from the 1960s to the early 2000s, the present study documents how the language of text producers in development economics constitutes and depends on these writers' attitudes towards development issues and towards women and men. The analysis focuses on how activation and passivation processes are used in the representation of the two main participants, women and men, how the concept of gender is introduced, and how development issues change across approaches, over time and between genres. The theoretical framework spans several disciplines: systemic functional grammar and critical discourse analysis, but also organizational discourse analysis and development studies. The texts chosen for analysis come from three different sources: plans from the world conferences on women organized by the United Nations, resolutions on women and development adopted by the United Nations General Assembly, and plans of action for women and development written by the Food and Agriculture Organization of the United Nations (FAO). The linguistic method of analysis builds on the system of roles and ways of representing participants developed by Halliday and Van Leeuwen. For each decade and each genre, the study examines the changes in process types and participant roles, as well as the change of focus on women-related issues and the conceptualization of gender. The quantitative analysis is complemented and reinforced by a detailed analysis of text fragments from different points in time and approaches. The results of the study are grammatical and lexical in nature and relate to gender, genre and time. The study shows that activation processes are considerably more numerous than passivation processes in the representation of women.
A better understanding of participant representation is achieved, however, by regrouping the grammatical processes into identifying, activating and directed processes. The shift from a focus on women to a focus on gender is not so much a change in the processes that represent the participants as a change in the rhetoric of the approaches and their focus: from integration of women to women's empowerment, from women's situation to gender relations, from urgent addition to social conflict and cooperation.
Abstract:
The aim of this study is to analyse the content of the interdisciplinary conversations in Göttingen between 1949 and 1961. The task is to compare models for describing reality presented by quantum physicists and theologians. Descriptions of reality in different disciplines are conditioned by the development of the concept of reality in philosophy, physics and theology. Our basic problem is stated in the question: How is it possible for the intramental image to match the external object? Cartesian knowledge presupposes clear and distinct ideas in the mind prior to observation, resulting in a true correspondence between the observed object and the cogitative observing subject. The Kantian synthesis between rationalism and empiricism emphasises an extended character of representation. The human mind is not a passive receiver of external information, but is actively construing intramental representations of external reality in the epistemological process. Heidegger's aim was to reach a more primordial mode of understanding reality than what is possible in the Cartesian Subject-Object distinction. In Heidegger's philosophy, ontology as being-in-the-world is prior to knowledge concerning being. Ontology can be grasped only in the totality of being (Dasein), not only as an object of reflection and perception. According to Bohr, quantum mechanics introduces an irreducible loss in representation, which classically understood is a deficiency in knowledge. The conflicting aspects (particle and wave pictures) in our comprehension of physical reality cannot be completely accommodated into an entire and coherent model of reality. What Bohr rejects is not realism, but the classical Einsteinian version of it. By the use of complementary descriptions, Bohr tries to save a fundamentally realistic position. The fundamental question in Barthian theology is the problem of God as an object of theological discourse.
Dialectics is Barth's way to express knowledge of God avoiding a speculative theology and a human-centred religious self-consciousness. In Barthian theology, the human capacity for knowledge, independently of revelation, is insufficient to comprehend the being of God. Our knowledge of God is real knowledge in revelation and our words are made to correspond with the divine reality in an analogy of faith. The point of the Bultmannian demythologising programme was to claim the real existence of God beyond our faculties. We cannot simply define God as a human ideal of existence or a focus of values. The theological programme of Bultmann emphasised the notion that we can talk meaningfully of God only insofar as we have existential experience of his intervention. Common to all these twentieth century philosophical, physical and theological positions is a form of anti-Cartesianism. Consequently, in regard to their epistemology, they can be labelled antirealist. This common insight also made it possible to find a common meeting point between the different disciplines. In this study, the different standpoints from all three areas and the conversations in Göttingen are analysed in the framework of realism/antirealism. One of the first tasks in the Göttingen conversations was to analyse the nature of the likeness between the complementary structures in quantum physics introduced by Niels Bohr and the dialectical forms in the Barthian doctrine of God. The reaction against epistemological Cartesianism, metaphysics of substance and deterministic description of reality was the common point of departure for theologians and physicists in the Göttingen discussions. In his complementarity, Bohr anticipated the crossing of traditional epistemic boundaries and the generalisation of epistemological strategies by introducing interpretative procedures across various disciplines.
Abstract:
The use of domain-specific languages (DSLs) has been proposed as an approach to cost-effectively develop families of software systems in a restricted application domain. Domain-specific languages, in combination with the accumulated knowledge and experience of previous implementations, can in turn be used to generate new applications with unique sets of requirements. For this reason, DSLs are considered to be an important approach for software reuse. However, the toolset supporting a particular domain-specific language is also domain-specific and is by definition not reusable. Therefore, creating and maintaining a DSL requires additional resources that could be even larger than the savings associated with using it. As a solution, different tool frameworks have been proposed to simplify and reduce the cost of development of DSLs. Developers of tool support for DSLs need to instantiate, customize or configure the framework for a particular DSL. There are different approaches for this. One approach is to use an application programming interface (API) and to extend the basic framework using an imperative programming language. An example of a tool based on this approach is Eclipse GEF. Another approach is to configure the framework using declarative languages that are independent of the underlying framework implementation. We believe this second approach can bring important benefits, as it brings focus to specifying what the tool should be like instead of writing a program specifying how the tool achieves this functionality. In this thesis we explore this second approach. We use graph transformation as the basic approach to customize a domain-specific modeling (DSM) tool framework.
The contributions of this thesis include a comparison of different approaches for defining, representing and interchanging software modeling languages and models, and a tool architecture for an open domain-specific modeling framework that efficiently integrates several model transformation components and visual editors. We also present several specific algorithms and tool components for the DSM framework. These include an approach for graph query based on region operators and the star operator, and an approach for reconciling models and diagrams after executing model transformation programs. We exemplify our approach with two case studies, MICAS and EFCO. In these studies we show how our experimental modeling tool framework has been used to define tool environments for domain-specific languages.
Abstract:
"Helmiä sioille", pearls before swine, is what one says in Finnish about something good and fine that is received by a recipient who is unwilling or unable to understand, appreciate or make use of the full potential of the thing received, or who is uninterested in it or does not like it. For such relatively stable multi-word expressions, which are stored in language users' memories and which display various kinds of irregular features in their structure, linguistics uses, among others, the terms "idiom" or "phraseological units". One such irregularity is, for example, the fact that the meaning of the expression is not the one that one would arrive at by treating it as an ordinary regular phrase. Another irregularity that idiom researchers have observed lies in the limited capacity for variation in form and meaning that many idioms have compared with regular phrases. For this reason one often speaks of the "base form" and "base meaning" of an idiom, and variation is regarded as deviation from these. But when one looks at a large number of attested examples of idioms in language use, one notices that many of them allow variation, even to such an extent that the boundaries between a variant and a "base form" are blurred, and instead of a single idiom we suddenly encounter a "family" of several related expressions. All this raises the question of how these expressions should actually be represented in language. The thesis carries out a critical review of various earlier approaches to describing phraseological units, with the aim of clarifying what difficulties their structure and variation pose for linguistic theory. At the same time, an alternative way of describing these expressions is presented.
A systematic and formal model developed in this thesis integrates a description of idioms on many different linguistic levels and depicts their variation in the form of a network and as a result of the interplay between the idiom's structure and the contexts in which it occurs, as well as of interaction with other fixed expressions. The model builds on an in-depth, usage-based analysis of the Finnish idiom "X HEITTÄÄ HELMIÄ SIOILLE" (X casts pearls before swine).
Abstract:
The goal of this thesis is to estimate the effect of the form of knowledge representation on the efficiency of knowledge sharing. The objectives include the design of an experimental framework that makes it possible to establish this effect, data collection, and statistical analysis of the collected data. The study follows an experimental quantitative design. The experimental questionnaire features three sample forms of knowledge: text, mind maps, and concept maps. In the interview, these forms are presented to an interviewee; afterwards, the knowledge sharing time and knowledge sharing quality are measured. According to the statistical analysis of 76 interviews, text performs worse in both knowledge sharing time and quality compared to visualized forms of knowledge representation. However, the difference between mind maps and concept maps in knowledge sharing time and quality is not statistically significant. Since visualized structured forms of knowledge perform better than unstructured text in knowledge sharing, companies are advised to foster the usage of these forms in knowledge sharing processes inside the company. Aside from performance in knowledge sharing, the visualized structured forms are preferable due to the possibility of their usage in a system of ontological knowledge management within an enterprise.