12 results for Natural Language Processing
in Universitätsbibliothek Kassel, Universität Kassel, Germany
Abstract:
Ontologies have been established for knowledge sharing and are widely used as a means of conceptually structuring domains of interest. With the growing usage of ontologies, the problem of overlapping knowledge in a common domain becomes critical. In this short paper, we address two methods for merging ontologies based on Formal Concept Analysis: FCA-Merge and ONTEX. FCA-Merge is a method for merging ontologies following a bottom-up approach which offers a structural description of the merging process. The method is guided by application-specific instances of the given source ontologies. We apply techniques from natural language processing and formal concept analysis to derive a lattice of concepts as a structural result of FCA-Merge. The generated result is then explored and transformed into the merged ontology with human interaction. ONTEX is a method for systematically structuring the top level of ontologies. It is based on an interactive, top-down knowledge acquisition process, which ensures that the knowledge engineer considers all possible cases while avoiding redundant acquisition. The method is especially suited for creating or merging the top part(s) of ontologies, where high accuracy is required, and for supporting the merging of two (or more) ontologies on that level.
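The lattice derivation mentioned above can be illustrated with a minimal sketch of Formal Concept Analysis. It is not the FCA-Merge implementation: the toy context (three documents tagged with concepts from two hypothetical source ontologies `O1` and `O2`) and all function names are assumptions made for illustration, and the enumeration is the naive one that closes every subset of objects.

```python
from itertools import combinations

def intent(objs, context, attributes):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    shared = set(attributes)
    for o in objs:
        shared &= context[o]
    return frozenset(shared)

def extent(attrs, context):
    """Objects that carry every attribute in attrs."""
    return frozenset(o for o, a in context.items() if attrs <= a)

def formal_concepts(context):
    """Naive enumeration: a formal concept is a pair (extent, intent) closed in both directions."""
    attributes = set().union(*context.values())
    objects = list(context)
    concepts = set()
    for r in range(len(objects) + 1):
        for objs in combinations(objects, r):
            b = intent(objs, context, attributes)
            concepts.add((extent(b, context), b))
    return concepts

# Hypothetical context: documents (instances) x concepts of two source ontologies.
context = {
    "doc1": {"O1:Hotel", "O2:Accommodation"},
    "doc2": {"O1:Hotel", "O2:Accommodation", "O1:Event"},
    "doc3": {"O1:Event", "O2:Concert"},
}
concepts = formal_concepts(context)
```

In this toy context, `O1:Hotel` and `O2:Accommodation` always occur together and therefore end up in the same formal concept, which is the kind of structural hint a knowledge engineer could inspect when deciding whether to merge two source concepts.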
Abstract:
This thesis aims at empowering software customers with a tool to build software tests themselves, based on a gradual refinement of natural language scenarios into executable visual test models. The process is divided into five steps: 1. First, a natural language parser is used to extract a graph of grammatical relations from the textual scenario descriptions. 2. The resulting graph is transformed into an informal story pattern by interpreting structurization rules based on Fujaba Story Diagrams. 3. While the informal story pattern can already be used by humans, the diagram still lacks technical details, especially type information. To add them, a recommender-based framework uses web sites and other resources to generate formalization rules. 4. As a preparation for the code generation, the classes derived for formal story patterns are aligned across all story steps, which substitutes for a class diagram. 5. Finally, a headless version of Fujaba is used to generate an executable JUnit test. The graph transformations used in the browser application are specified in a textual domain-specific language and visualized as story patterns. Only the heavyweight parsing (step 1) and code generation (step 5) are executed on the server side. All graph transformation steps (2, 3 and 4) are executed in the browser by an interpreter written in JavaScript/GWT. This result paves the way for online collaboration between global teams of software customers, IT business analysts and software developers.
Abstract:
Restarting automata are a restricted model of computation that was introduced by Jančar et al. to model the so-called analysis by reduction. A computation of a restarting automaton consists of a sequence of cycles such that in each cycle the automaton performs exactly one rewrite step, which replaces a small part of the tape content by another, even shorter word. Thus, each language accepted by a restarting automaton belongs to the complexity class $\mathsf{CSL} \cap \mathsf{NP}$. Here we consider a natural generalization of this model, called shrinking restarting automaton, where we no longer insist on the requirement that each rewrite step decreases the length of the tape content. Instead we require that there exists a weight function such that each rewrite step decreases the weight of the tape content with respect to that function. The language accepted by such an automaton still belongs to the complexity class $\mathsf{CSL} \cap \mathsf{NP}$. While it is still unknown whether the two most general types of one-way restarting automata, the RWW-automaton and the RRWW-automaton, differ in their expressive power, we will see that the classes of languages accepted by the shrinking RWW-automaton and the shrinking RRWW-automaton coincide. As a consequence of our proof, it turns out that there exists a reduction by morphisms from the language class $\mathcal{L}(\mathsf{RRWW})$ to the class $\mathcal{L}(\mathsf{RWW})$. Further, we will see that the shrinking restarting automaton is a rather robust model of computation. Finally, we will relate shrinking RRWW-automata to finite-change automata. This will lead to some new insights into the relationships between the classes of languages characterized by (shrinking) restarting automata and some well-known time and space complexity classes.
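The weight condition can be made concrete with a small sketch. It assumes, as an illustration rather than as the paper's exact definition, that the weight function assigns a positive integer to each tape symbol and is extended to words by summation; the alphabet, the weights and the rewrite rules below are hypothetical.

```python
def word_weight(word, weights):
    """Weight of a word: sum of the (positive) weights of its symbols."""
    return sum(weights[symbol] for symbol in word)

def is_weight_reducing(rules, weights):
    """True if every rewrite rule u -> v strictly decreases the weight."""
    return all(word_weight(v, weights) < word_weight(u, weights) for u, v in rules)

# Hypothetical weights: the auxiliary symbol "A" is heavier than the input symbols.
weights = {"a": 1, "b": 1, "A": 3}

# ("Ab", "ab") keeps the length but lowers the weight from 4 to 2;
# ("aab", "ab") is an ordinary length-reducing rule.
rules = [("Ab", "ab"), ("aab", "ab")]

shrinking_ok = is_weight_reducing(rules, weights)
swap_ok = is_weight_reducing([("ab", "ba")], weights)  # weight unchanged, so not allowed
```

The first rule would not be permitted in an ordinary restarting automaton because it does not shorten the tape content, yet it satisfies the shrinking condition under these weights.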
Abstract:
Analysis by reduction is a method used in linguistics for checking the correctness of sentences of natural languages. This method is modelled by restarting automata. All types of restarting automata considered in the literature up to now accept at least the deterministic context-free languages. Here we introduce and study a new type of restarting automaton, the so-called t-RL-automaton, which is an RL-automaton that is rather restricted in that it has a window of size one only, and that it works under a minimal acceptance condition. On the other hand, it is allowed to perform up to t rewrite (that is, delete) steps per cycle. Here we study the gap-complexity of these automata. The membership problem for a language that is accepted by a t-RL-automaton with a bounded number of gaps can be solved in polynomial time. On the other hand, t-RL-automata with an unbounded number of gaps accept NP-complete languages.
Abstract:
Restarting automata can be seen as analytical variants of classical automata as well as of regulated rewriting systems. We study a measure for the degree of nondeterminism of (context-free) languages in terms of deterministic restarting automata that are (strongly) lexicalized. This measure is based on the number of auxiliary symbols (categories) used for recognizing a language as the projection of its characteristic language onto its input alphabet. This type of recognition is typical for analysis by reduction, a method used in linguistics for the creation and verification of formal descriptions of natural languages. Our main results establish a hierarchy of classes of context-free languages and two hierarchies of classes of non-context-free languages that are based on the expansion factor of a language.
Abstract:
Land use is a crucial link between human activities and the natural environment and one of the main driving forces of global environmental change. Large parts of the terrestrial land surface are used for agriculture, forestry, settlements and infrastructure. Given the importance of land use, it is essential to understand the multitude of influential factors and resulting land use patterns. Land-use models provide an essential methodology for studying and quantifying such interactions. By applying land-use models, it is possible to analyze the complex structure of linkages and feedbacks and to determine the relevance of driving forces. Modeling land use and land use changes has a long tradition. In particular on the regional scale, a variety of models for different regions and research questions have been created. Modeling capabilities grow with steady advances in computer technology, which are driven on the one hand by increasing computing power and on the other hand by new methods in software development, e.g. object- and component-oriented architectures. In this thesis, SITE (Simulation of Terrestrial Environments), a novel framework for integrated regional land-use modeling, is introduced and discussed. Particular features of SITE are the notably extended capability to integrate models and the strict separation of application and implementation. These features enable efficient development, testing and use of integrated land-use models. On its system side, SITE provides generic data structures (grid, grid cells, attributes etc.) and takes over the responsibility for their administration. By means of a scripting language (Python) that has been extended by language features specific to land-use modeling, these data structures can be utilized and manipulated by modeling applications. The scripting language interpreter is embedded in SITE.
The integration of sub-models can be achieved via the scripting language or by using a generic interface provided by SITE. Furthermore, functionalities important for land-use modeling, such as model calibration, model tests and support for analyzing simulation results, have been integrated into the generic framework. During the implementation of SITE, particular emphasis was placed on expandability, maintainability and usability. Along with the modeling framework, a land-use model for the analysis of the stability of tropical rainforest margins was developed in the context of the collaborative research project STORMA (SFB 552). In a research area in Central Sulawesi, Indonesia, socio-environmental impacts of land-use changes were examined. SITE was used to simulate land-use dynamics in the historical period of 1981 to 2002. In addition, a scenario that did not consider migration in the population dynamics was analyzed. For the calculation of crop yields and trace gas emissions, the DAYCENT agro-ecosystem model was integrated. This case study showed that land-use changes in the Indonesian research area were mainly characterized by the expansion of agricultural areas at the expense of natural forest. For this reason, the situation had to be interpreted as unsustainable even though increased agricultural use implied economic improvements and higher farmers' incomes. Due to the importance of model calibration, it was explicitly addressed in the SITE architecture through the introduction of a specific component. The calibration functionality can be used by all SITE applications and enables largely automated model calibration. Calibration in SITE is understood as a process that finds an optimal or at least adequate solution for a set of arbitrarily selectable model parameters with respect to an objective function. In SITE, an objective function typically is a map comparison algorithm capable of comparing a simulation result to a reference map.
Several map optimization and map comparison methodologies are available and can be combined. The STORMA land-use model was calibrated using a genetic algorithm for optimization and the figure of merit map comparison measure as objective function. The time period for the calibration ranged from 1981 to 2002. For this period, respective reference land-use maps were compiled. It could be shown that an efficient automated model calibration with SITE is possible. Nevertheless, the selection of the calibration parameters required detailed knowledge about the underlying land-use model and cannot be automated. In another case study, decreases in crop yields and resulting losses in income from coffee cultivation were analyzed and quantified under the assumption of four different deforestation scenarios. For this task, an empirical model describing the dependence of bee pollination and resulting coffee fruit set on the distance to the closest natural forest was integrated. Land-use simulations showed that, depending on the magnitude and location of ongoing forest conversion, pollination services are expected to decline continuously. This results in a reduction of coffee yields of up to 18% and a loss of net revenues per hectare of up to 14%. However, the study also showed that ecological and economic values can be preserved if patches of natural vegetation are conserved in the agricultural landscape.
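As an illustration of a map comparison measure used as an objective function, the following sketch computes one commonly used definition of the figure of merit: the number of cells whose observed change was simulated correctly, divided by the number of cells where change was either observed or simulated. Whether SITE implements exactly this variant is an assumption; the function name, the flat list representation of the maps and the land-use codes are hypothetical and do not reflect SITE's actual API.

```python
def figure_of_merit(initial, reference, simulated):
    """Figure of merit for three aligned maps given as flat sequences of land-use codes.

    initial:   map at the start of the period
    reference: observed map at the end of the period
    simulated: simulated map at the end of the period
    """
    hits = misses = wrong_hits = false_alarms = 0
    for start, observed, predicted in zip(initial, reference, simulated):
        observed_change = observed != start
        predicted_change = predicted != start
        if observed_change and predicted_change:
            if observed == predicted:
                hits += 1          # change simulated correctly
            else:
                wrong_hits += 1    # change simulated, but to the wrong category
        elif observed_change:
            misses += 1            # observed change simulated as persistence
        elif predicted_change:
            false_alarms += 1      # observed persistence simulated as change
    changed = hits + misses + wrong_hits + false_alarms
    if changed == 0:
        raise ValueError("figure of merit is undefined when no cell changes")
    return hits / changed

# Hypothetical six-cell maps: F = forest, A = annual crops, C = coffee.
initial = ["F", "F", "F", "F", "A", "A"]
reference = ["A", "A", "F", "C", "A", "A"]
simulated = ["A", "F", "A", "A", "A", "A"]
fom = figure_of_merit(initial, reference, simulated)  # one hit out of four changed cells
```

A calibration loop would call such a function once per candidate parameter set, with the optimizer (for example a genetic algorithm) searching for the parameters that maximize the returned value.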
Abstract:
Cooperative behaviour of agents within highly dynamic and nondeterministic domains is an active field of research. In particular, establishing highly responsive teamwork, where agents are able to react to dynamic changes in the environment while facing unreliable communication and sensory noise, is an open problem. Moreover, modelling such responsive, cooperative behaviour is difficult. In this work, we specify a novel model for cooperative behaviour geared towards highly dynamic domains. In our approach, agents estimate each other’s decisions and correct these estimations once they receive contradictory information. We aim at a comprehensive approach for agent teamwork featuring intuitive modelling capabilities for multi-agent activities, abstractions over activities and agents, and a clear operational semantics for the new model. This work encompasses a complete specification of the new language, ALICA.
Abstract:
Distributed systems are one of the most vital components of the economy. The most prominent example is probably the internet, a constituent element of our knowledge society. In recent years, the number of novel network types has steadily increased. Amongst others, sensor networks, distributed systems composed of tiny computational devices with scarce resources, have emerged. The further development and heterogeneous connection of such systems impose new requirements on the software development process. Mobile and wireless networks, for instance, have to organize themselves autonomously and must be able to react to changes in the environment and to failing nodes alike. Researching new approaches for the design of distributed algorithms may lead to methods with which these requirements can be met efficiently. In this thesis, one such method is developed, tested, and discussed with respect to its practical utility. Our new design approach for distributed algorithms is based on Genetic Programming, a member of the family of evolutionary algorithms. Evolutionary algorithms are metaheuristic optimization methods which copy principles from natural evolution. They use a population of solution candidates which they try to refine step by step in order to attain optimal values for predefined objective functions. The synthesis of an algorithm with our approach starts with an analysis step in which the desired global behavior of the distributed system is specified. From this specification, objective functions are derived which steer a Genetic Programming process where the solution candidates are distributed programs. The objective functions rate how closely these programs approximate the goal behavior in multiple randomized network simulations. Step by step, the evolutionary process selects the most promising solution candidates and modifies and combines them with mutation and crossover operators.
This way, a description of the global behavior of a distributed system is translated automatically into programs which, if executed locally on the nodes of the system, exhibit this behavior. In our work, we test six different ways of representing distributed programs, comprising adaptations and extensions of well-known Genetic Programming methods (SGP, eSGP, and LGP), one bio-inspired approach (Fraglets), and two new program representations called Rule-based Genetic Programming (RBGP, eRBGP) designed by us. We breed programs in these representations for three well-known example problems in distributed systems: election algorithms, distributed mutual exclusion at a critical section, and the distributed computation of the greatest common divisor of a set of numbers. Synthesizing distributed programs the evolutionary way does not necessarily lead to the envisaged results. In a detailed analysis, we discuss the problematic features which make this form of Genetic Programming particularly hard. The two Rule-based Genetic Programming approaches have been developed especially in order to mitigate these difficulties. In our experiments, at least one of them (eRBGP) turned out to be a very efficient approach and was, in most cases, superior to the other representations.
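The role of an objective function that rates candidates over randomized network simulations can be sketched for the greatest-common-divisor example. This is only an illustration under strong assumptions: the ring topology, the synchronous update rounds, the scoring formula and all names are hypothetical, the candidates are plain Python functions rather than evolved programs in one of the representations above, and no selection, mutation or crossover step is shown.

```python
import math
import random
from functools import reduce

def simulate(rule, values, rounds):
    """Run a local update rule synchronously on a ring: each node combines
    its own value with the values of its two neighbours in every round."""
    n = len(values)
    for _ in range(rounds):
        updated = []
        for i in range(n):
            v = values[i]
            for j in ((i - 1) % n, (i + 1) % n):
                v = rule(v, values[j])
            updated.append(v)
        values = updated
    return values

def objective(rule, simulations=20, nodes=8, seed=0):
    """Average fraction of nodes that end up holding the global GCD,
    measured over several randomized simulations (1.0 is the best score)."""
    rng = random.Random(seed)  # fixed seed keeps the rating reproducible
    total = 0.0
    for _ in range(simulations):
        values = [rng.randint(1, 100) for _ in range(nodes)]
        goal = reduce(math.gcd, values)
        final = simulate(rule, values, rounds=nodes)
        total += sum(v == goal for v in final) / nodes
    return total / simulations

good_score = objective(math.gcd)          # hand-written correct local rule
poor_score = objective(lambda a, b: a)    # candidate that ignores its neighbours
```

An evolutionary process would use scores like these to decide which candidates survive and are recombined; the hand-written `math.gcd` rule serves here only as a reference point that reaches the maximum score.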
Abstract:
Recent research on the evolution of language and verbal displays (e.g., Miller, 1999, 2000a, 2000b, 2002) indicated that language is not only the result of natural selection but also serves as a sexually selected fitness indicator, that is, an adaptation showing an individual’s suitability as a reproductive mate. Thus, language could be placed within the framework of concepts such as the handicap principle (Zahavi, 1975). There are several reasons for this position: many linguistic traits are highly heritable (Stromswold, 2001, 2005), while naturally selected traits are only marginally heritable (Miller, 2000a); men are more prone to verbal displays than women, who in turn judge the displays (Dunbar, 1996; Locke & Bogin, 2006; Lange, in press; Miller, 2000a; Rosenberg & Tunney, 2008); verbal proficiency universally raises status, especially male status (Brown, 1991); many linguistic features are handicaps (Miller, 2000a) in the Zahavian sense; most literature is produced by men at a reproduction-relevant age (Miller, 1999). However, neither an experimental study investigating the causal relation between verbal proficiency and attractiveness nor a study showing a correlation between markers of literary and mating success existed. The current studies aimed to fill these gaps. In the first one, I conducted a laboratory experiment. Videos in which an actor and an actress performed verbal self-presentations were the stimuli for opposite-sex participants. Content was always alike, but the videos differed on three levels of verbal proficiency. Predictions were, among others, that (1) verbal proficiency increases mate value, but that (2) this applies more to male than to female mate value due to assumed past sex-specific selection pressures causing women to be very demanding in mate choice (Trivers, 1972).
After running a two-factorial analysis of variance with the variables sex and verbal proficiency as factors, the first hypothesis was supported with high effect size. For the second hypothesis, there was only a trend in the predicted direction. Furthermore, it became evident that verbal proficiency affects long-term mate value more than short-term mate value. In the second study, verbal proficiency as a menstrual cycle-dependent mate choice criterion was investigated. Basically the same materials as in the first study were used, with only marginal changes to the questionnaire. The hypothesis was that fertile women rate high verbal proficiency in men higher than non-fertile women do, because verbal proficiency is a potential indicator of “good genes”. However, no significant result in support of the hypothesis could be obtained in this study. In the third study, the hypotheses were: (1) Most literature is produced by men at a reproduction-relevant age. (2) The more works of high literary quality a male writer produces, the more mates and children he has. (3) Poets have higher mating success than writers of other genres because poetic language is a larger handicap than other forms of language. (4) Writing literature increases a man’s status insofar as his offspring show a significantly higher male-to-female sex ratio than the general population, as the Trivers-Willard hypothesis (Trivers & Willard, 1973) applied to literature predicts. In order to test these hypotheses, two famous literary canons were chosen. Extensive biographical research was conducted on the writers’ mating successes. The first hypothesis was confirmed; the second one, controlling for age, was confirmed only for number of mates but not entirely for number of children. The latter finding was discussed with respect to, among other things, the availability of effective contraception, especially in the 20th century. The third hypothesis was not satisfactorily supported.
The fourth hypothesis was partially supported. For the 20th century part of the German list, the secondary sex ratio differed with high statistical significance from the ratio assumed to be valid for a general population.
Abstract:
In psycholinguistic research, it is widely assumed that evaluating information with regard to its truth or plausibility (epistemic validation; Richter, Schroeder & Wöhrmann, 2009) is a strategic, optional process that follows comprehension (e.g., Gilbert, 1991; Gilbert, Krull & Malone, 1990; Gilbert, Tafarodi & Malone, 1993; Herbert & Kübler, 2011). A growing number of studies, however, directly or indirectly call this two-stage model of comprehension and validation into question. In particular, findings on Stroop-like stimulus-response compatibility effects, which occur when positive and negative responses must be given orthogonally to the task-irrelevant truth value of sentences (e.g., a positive response after reading a false sentence or a negative response after reading a true sentence; epistemic Stroop effect, Richter et al., 2009), suggest that readers already perform a non-strategic check of the validity of information during comprehension. Based on these findings, the aim of this dissertation was to further test the assumption that comprehension involves a non-strategic, routinized, knowledge-based validation process (epistemic monitoring; Richter et al., 2009). To this end, three empirical studies with different foci were conducted. Study 1 investigated whether evidence for epistemic monitoring can also be found for information that is not clearly true or false but merely more or less plausible. Using the epistemic Stroop paradigm of Richter et al. (2009), a compatibility effect of task-irrelevant plausibility on the latencies of positive and negative responses was demonstrated in two different experimental tasks, which suggests that epistemic monitoring also takes into account gradual differences in the agreement of information with world knowledge. Moreover, the results show that the epistemic Stroop effect is indeed based on plausibility and not on the differing predictability of plausible and implausible information. The aim of Study 2 was to test the hypothesis that epistemic monitoring does not require an evaluative mindset. In contrast to the findings of other authors (Wiswede, Koranyi, Müller, Langner, & Rothermund, 2013), this study showed a compatibility effect of task-irrelevant truth value on response latencies in a completely non-evaluative task. The results suggest that epistemic monitoring does not depend on an evaluative mindset but possibly on the depth of processing. Study 3 examined the relationship between comprehension and validation by investigating the online effects of plausibility and predictability on eye movements during the reading of short texts. In addition, the potential modulation of these effects by epistemic markers that indicate the certainty of information (e.g., sicherlich "certainly" or vielleicht "perhaps") was examined. Consistent with the assumption of a fast and non-strategic epistemic monitoring process, interactive effects of plausibility and the presence of epistemic markers were found on indicators of early comprehension processes. This suggests that the communicated certainty of information is taken into account by the monitoring process.
Overall, the findings speak against a conceptualization of comprehension and validation as non-overlapping stages of information processing. Rather, an evaluation of truth or plausibility based on world knowledge appears to be, at least to some extent, an obligatory and non-strategic component of language comprehension. The implications of the findings for current models of language comprehension and recommendations for further research on the relationship between comprehension and validation are outlined.
Abstract:
This interdisciplinary doctoral thesis investigates a modification of the (Al)GaN semiconductor surface with the aim of producing an improved interface between the material and the dielectric. Because of surface states, GaN-based HEMT structures usually exhibit large threshold voltage shifts. So far, interface modification has mainly been analyzed in terms of the removal of contaminants such as oxygen or carbon. The wet-chemical surface treatments are carried out before the deposition of the dielectric, but the contaminants cannot be removed completely. In this work, modifications of the surface in aqueous solutions, in gases and in plasma are analyzed. Detailed investigations show that the inert (0001) c-plane of the surface hardly reacts; instead, it is mainly the less polar r- and m-planes that react. This can be clearly observed in defect etching as well as in thermal oxidation. Plasma treatments represent a further approach to surface modification. Here, the surface termination is changed by a nucleophilic substitution with Lewis bases such as fluoride, chloride or oxide, which increases the electronegativity difference between the metal and the anion compared to the metal-nitrogen bond. At the same time, this leads to an increase in the potential difference of the Schottky contact. Oxygen and fluorine have the necessary thermal stability to remain on the (Al)GaN surface during silicon nitride deposition. Variations in the oxygen content at the surface are always reduced to about 6-8% in NH3 at 700 °C, which are the conditions required for the deposition; such interfaces therefore show no changed results in threshold voltage investigations.
In contrast, the fluorinated surface shows completely new electrical behavior: a new dominant surface donor with fast trapping and detrapping behavior is found. The energy level of this new, stable donor lies about 0.5 eV deeper in the band gap than the original energy levels of the surface states. Physicochemical surface and interface analysis with XPS, AES or SIMS does not allow an unambiguous conclusion as to whether the fluorine is actually still present at the interface after the Si3N4 deposition, or whether a more stable surface reconstruction was simply induced in which the fluorine itself is not involved. In both cases, the new donor is present at a concentration of 4 × 10^13 at/cm^2. This density corresponds to a surface concentration of about 1%, which lies exactly at the detection limit of the spectroscopic methods. Nevertheless, the electrical surface properties are clearly changed by the surface modification and enable an interface that can potentially be optimized further.