7 results for Simplification of Ontologies
in Helda - Digital Repository of University of Helsinki
Abstract:
This dissertation is a theoretical study of finite-state based grammars used in natural language processing. The study is concerned with certain varieties of finite-state intersection grammars (FSIG) whose parsers define regular relations between surface strings and annotated surface strings. The study focuses on the following three aspects of FSIGs: (i) Computational complexity of grammars under limiting parameters. In the study, the computational complexity in practical natural language processing is approached through performance-motivated parameters on structural complexity. Each parameter splits some grammars in the Chomsky hierarchy into an infinite set of subset approximations. When the approximations are regular, they seem to fall into the logarithmic-time hierarchy and the dot-depth hierarchy of star-free regular languages. This theoretical result is important and possibly relevant to grammar induction. (ii) Linguistically applicable structural representations. Related to the linguistically applicable representations of syntactic entities, the study contains new bracketing schemes that cope with dependency links, left- and right-branching, crossing dependencies and spurious ambiguity. New grammar representations that resemble the Chomsky-Schützenberger representation of context-free languages are presented in the study, and they include, in particular, representations for mildly context-sensitive non-projective dependency grammars whose performance-motivated approximations are linear time parseable. (iii) Compilation and simplification of linguistic constraints. Efficient compilation methods for certain regular operations such as generalized restriction are presented. These include an elegant algorithm that has already been adopted as the approach in a proprietary finite-state tool. In addition to the compilation methods, an approach to on-the-fly simplifications of finite-state representations for parse forests is sketched. 
These findings are tightly coupled with each other under the theme of locality. I argue that the findings help us to develop better, linguistically oriented formalisms for finite-state parsing and to develop more efficient parsers for natural language processing. Keywords: syntactic parsing, finite-state automata, dependency grammar, first-order logic, linguistic performance, star-free regular approximations, mildly context-sensitive grammars
Abstract:
Ecology and evolutionary biology is the study of life on this planet. One of the many methods applied to answering the great diversity of questions regarding the lives and characteristics of individual organisms is the utilization of mathematical models. Such models are used in a wide variety of ways. Some help us to reason, functioning as aids to, or substitutes for, our own fallible logic, thus making argumentation and thinking clearer. Models which help our reasoning can lead to conceptual clarification; by expressing ideas in algebraic terms, the relationships between different concepts become clearer. Other mathematical models are used to better understand yet more complicated models, or to develop mathematical tools for their analysis. Though helping us to reason and being used as tools in the craftsmanship of science, many models do not tell us much about the real biological phenomena we are, at least initially, interested in. The main reason for this is that any mathematical model is a simplification of the real world, reducing the complexity and variety of interactions and idiosyncrasies of individual organisms. What such models can tell us, however, both is and has been very valuable throughout the history of ecology and evolution. Minimally, a model simplifying the complex world can tell us that in principle, the patterns produced in a model could also be produced in the real world. We can never know how different a simplified mathematical representation is from the real world, but the similarity that models strive for gives us confidence that their results could apply. This thesis deals with a variety of different models, used for different purposes. One model deals with how one can measure and analyse invasions, the expanding phase of invasive species. Earlier analyses claim to have shown that such invasions can be a regulated phenomenon, in that higher invasion speeds at a given point in time will lead to a reduction in speed. 
Two simple mathematical models show that analysis of this particular measure of invasion speed need not be evidence of regulation. In the context of dispersal evolution, two models acting as proof-of-principle are presented. Parent-offspring conflict emerges when there are different evolutionary optima for adaptive behavior for parents and offspring. We show that the evolution of dispersal distances can entail such a conflict, and that under parental control of dispersal (as, for example, in higher plants) wider dispersal kernels are optimal. We also show that dispersal homeostasis can be optimal; in a setting where dispersal decisions (to leave or stay in a natal patch) are made, strategies that divide their seeds or eggs into fractions that disperse or not, as opposed to randomizing the decision for each seed, can prevail. We also present a model of the evolution of bet-hedging strategies, evolutionary adaptations that occur despite their fitness, on average, being lower than that of a competing strategy. Such strategies can win in the long run because they have a reduced variance in fitness coupled with a reduction in mean fitness, and fitness is of a multiplicative nature across generations, and therefore sensitive to variability. This model is used for conceptual clarification: by developing a population genetical model with uncertain fitness and expressing genotypic variance in fitness as a product of individual-level variance and correlations between individuals of a genotype, we arrive at expressions that intuitively reflect two of the main categorizations of bet-hedging strategies: conservative vs diversifying and within- vs between-generation bet hedging. In addition, this model shows that these divisions are in fact false dichotomies.
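The multiplicative nature of fitness across generations mentioned in the abstract above can be illustrated with a minimal sketch. It is not the thesis's population genetical model; the two strategies and their fitness values are hypothetical numbers chosen only to show that a strategy with a lower arithmetic mean fitness can still have the higher geometric mean, which is what governs long-run growth.

```python
import math

def arithmetic_mean(fitnesses):
    """Average fitness across environments (what 'on average' refers to)."""
    return sum(fitnesses) / len(fitnesses)

def geometric_mean(fitnesses):
    """Per-generation growth factor when fitness multiplies across generations."""
    return math.exp(sum(math.log(w) for w in fitnesses) / len(fitnesses))

def long_run_growth(fitnesses, generations):
    """Multiply fitness values, cycling through the environments in order."""
    size = 1.0
    for g in range(generations):
        size *= fitnesses[g % len(fitnesses)]
    return size

# Hypothetical fitness in alternating good and bad years (assumed values).
risky = [2.0, 0.5]    # high mean, high variance
hedger = [1.2, 1.1]   # lower mean, much lower variance

for name, w in (("risky", risky), ("hedger", hedger)):
    print(name, arithmetic_mean(w), geometric_mean(w), long_run_growth(w, 10))
```

In this toy setting the risky strategy has the higher arithmetic mean, but its lineage size returns to its starting value after every good-bad cycle, while the hedging strategy grows each cycle.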
Abstract:
With respect to resource management and environmental impact, organic farming offers rationales for agricultural sustainability. However, agronomic productivity is usually higher with conventional farming. This work aimed at investigating two factors of major importance for the agronomic productivity of organic crop husbandry, nitrogen (N) supply through symbiotic N fixation (SNF) and weed occurrence. Perennial red clover-grass leys and spring cereal crops subjected to regular agricultural practices were studied on 34 organic farms located in the southern and the north-western coastal regions of Finland. Herbage growth, clover content as a proportion of the ley and extent of SNF in perennial leys, and the occurrence of weed species and weed-crop competition in spring cereal stands were related to climate conditions, soil properties, and management measures. The herbage accumulated from the first and the second cut of one- and two-year-old leys averaged 7.5 t DM ha⁻¹ (SD ± 1.7 t DM ha⁻¹); the clover content averaged 43.9% (SD ± 18.8%). Along with the clover content, herbage production decreased with ley age. Radiation use efficiency (RUE) correlated positively with clover proportion, but despite low clover contents, three-year-old leys were still productive with regard to RUE. SNF in the accumulated annual growth of one- and two-year-old leys averaged 247.5 kg N ha⁻¹ yr⁻¹ (SD ± 114.4 kg N ha⁻¹ yr⁻¹). It was supposed that if red clover-grass leys constituted 40% of the rotation, then the mean N supply by SNF would be able to sustain two or three succeeding cereal crops (green manure and forage ley, respectively), yielding 3.0 to 4.0 t grain ha⁻¹. Being a function of clover biomass, the SNF increased from the first to the second cut and thereafter declined with ley age. Coefficients of variation of clover contents (and SNF) between and within fields were around 50%, which was about twice as high as those of herbage production. 
The lower the clover content, the higher the within-field variation of clover as a proportion of the ley. Low clover contents in one-year-old leys and increasing variability with ley age suggested that red clover growth was limited by poor establishment and poor overwintering. The proportions of clover in leys were lower and their variability was higher in the northwest than in the south. Soil properties, primarily texture and structure, had a major impact on clover proportion and herbage production, which largely explained regional differences in ley growth. Within-field variability of soil properties can be amended through site-specific measures, including drainage, liming, and applications of organic manures and mineral fertilizers. Overwintering and the persistence of leys can be improved by the choice of winter-hardy varieties, careful establishment and the appropriate harvest regime. Mean grain yields of spring cereal crops amounted to 3.2 t ha⁻¹ in the south and 3.6 t ha⁻¹ in the northwest. At 570 and 565 m⁻² for the south and northwest respectively, mean weed densities did not differ between the regions, whereas the respective mean weed biomasses of 697 and 1594 kg dry weight ha⁻¹ did differ. Weed abundance varied remarkably between single fields. The number of weed species was higher in the south than in the northwest. For example, Fumaria officinalis and Lamium spp. were found only in the south. Frequencies and abundances of Lapsana communis, Myosotis arvensis, Polygonum aviculare, Tripleurospermum inodorum, and Vicia spp. were higher in the south, whereas those of Elymus repens, Persicaria spp. and Spergula arvensis were higher in the northwest. The number of years since conversion to organic farming, i.e. long-term management, was one of the variables that explained the abundance of single weed species. E. repens was the weed species whose biomass increased most with the duration of organic farming. 
Another significant variable was crop biomass, which was affected by short-term management. The presence of different weed species was related to the duration of organic farming and to low crop yield. This finding demonstrated that it was not the organic farming regime per se that resulted in high weed infestation and low-yielding crops, but failures in the understanding and the management of organic farming systems. Successful weed control relies on farm- and field-specific long- and short-term management approaches. The agronomic productivity of ley and spring cereal crops managed by full-time farmers with an interest in organic farming was on the same level as the mean for conventional farming. Given the many options for further improvements of the agronomic performance of organic arable systems, organic farming offers foundations for the development of sustainable agriculture. The main threat to the sustainability of farming in Finland, both conventional and organic, is the spatial separation of crop production and animal husbandry by region, along with the simplification of associated crop rotations.
Abstract:
Intensification of agricultural land-use has been shown to be the key reason behind declines in wildlife species associated with farmland. Accession to the European Union is regarded as a potential threat to the farmland biota of its new member states. In my thesis I looked at scenarios of agricultural development across the Baltic states of Estonia, Latvia and Lithuania, and the ways they are seen to affect farmed environments as a habitat of farmland bird species. I looked at the effects of major farmed habitats across the region, and assessed the role of spatial organisation of farmed habitats. I also evaluated the direction and magnitude of changes in bird communities following progression of farmland land-use from a relatively less intensive to the most intensive type within each country. Different aspects of the structural complexity of farmland were critical for supporting farmland birds. There was a clear indication that the more intensively farmed areas across the region provided habitat for fewer bird species and individuals, and intensification of field management was reflected in a tangible decrease in farmland bird abundance. The second part of the thesis, based on interviews in Estonia and Finland, is devoted to farmers' interest in and knowledge of farmland wildlife, their understanding of the concept of biodiversity, and awareness of causes behind species declines. I examined the relationship between farmers' interest and their willingness to undertake practices favouring farmland wildlife. Many farmers viewed biodiversity from a narrow perspective. In Finland farmers expressed higher concern about the decline in common farmland species than in Estonia. In both countries farmers rated intensification of agriculture as the major driving force behind declines. The expressed interest in wildlife positively correlated with willingness to undertake wildlife-friendly measures. 
Only farmers with agri-environment contracts targeted specifically at biodiversity were more knowledgeable about practical on-farm activities favouring wildlife, and were more willing to employ them than the rest. The results suggest that, by contributing to simplification of the farmland structure, homogenisation of crops, and increase in intensity of field use, EU agricultural policies will have a detrimental effect on farmland bird populations in Eastern Europe. Farmers are on the whole positive about the idea of supporting wildlife on their farms, and are concerned about declines, but they require payments to offset their income loss and extra work. I propose ways of further improving and better targeting the agri-environment schemes in the region. I argue that with a foreseen tripling of cereal yields across the region, the EU Council's target of halting biodiversity decline in the EU by 2010 may not be realistic unless considerable improvements are made in conservation safeguards within the EU agricultural policy for the region.
Abstract:
Topic detection and tracking (TDT) is an area of information retrieval research the focus of which revolves around news events. The problems TDT deals with relate to segmenting news text into cohesive stories, detecting new, previously unreported events, tracking the development of a previously reported event, and grouping together news stories that discuss the same event. The performance of the traditional information retrieval techniques based on full-text similarity has remained inadequate for online production systems. It has been difficult to make the distinction between same and similar events. In this work, we explore ways of representing and comparing news documents in order to detect new events and track their development. First, however, we put forward a conceptual analysis of the notions of topic and event. The purpose is to clarify the terminology and align it with the process of news-making and the tradition of story-telling. Second, we present a framework for document similarity that is based on semantic classes, i.e., groups of words with similar meaning. We adopt people, organizations, and locations as semantic classes in addition to general terms. As each semantic class can be assigned its own similarity measure, document similarity can make use of ontologies, e.g., geographical taxonomies. The documents are compared class-wise, and the outcome is a weighted combination of class-wise similarities. Third, we incorporate temporal information into document similarity. We formalize the natural language temporal expressions occurring in the text, and use them to anchor the rest of the terms onto the time-line. Upon comparing documents for event-based similarity, we look not only at matching terms, but also at how near their anchors are on the time-line. Fourth, we experiment with an adaptive variant of the semantic class similarity system. 
The news reflects changes in the real world, and in order to keep up, the system has to change its behavior based on the contents of the news stream. We put forward two strategies for rebuilding the topic representations and report experiment results. We run experiments with three annotated TDT corpora. The use of semantic classes increased the effectiveness of topic tracking by 10-30% depending on the experimental setup. The gain in spotting new events remained lower, around 3-4%. Anchoring the text to a time-line based on the temporal expressions gave a further 10% increase in the effectiveness of topic tracking. The gains in detecting new events, again, remained smaller. The adaptive systems did not improve the tracking results.
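The class-wise comparison described in the abstract above (a weighted combination of per-class similarities, where each semantic class may have its own measure) can be sketched as follows. This is an illustration under stated assumptions, not the system from the thesis: the class names, the weights, the use of cosine similarity as the default measure, and the function names `cosine` and `document_similarity` are all hypothetical.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-count mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def document_similarity(doc_a, doc_b, weights, measures=None):
    """Weighted combination of class-wise similarities.

    doc_a, doc_b: dict mapping a semantic class name to a Counter of terms.
    weights:      dict mapping a class name to a non-negative weight.
    measures:     optional dict mapping a class name to its own similarity
                  function; classes without an entry fall back to cosine.
    """
    measures = measures or {}
    total = sum(weights.values())
    score = 0.0
    for cls, weight in weights.items():
        measure = measures.get(cls, cosine)
        score += weight * measure(doc_a.get(cls, Counter()),
                                  doc_b.get(cls, Counter()))
    return score / total

# Hypothetical documents split into semantic classes (assumed data).
doc1 = {"persons": Counter({"smith": 2}),
        "locations": Counter({"helsinki": 1}),
        "terms": Counter({"election": 3, "vote": 1})}
doc2 = {"persons": Counter({"jones": 1}),
        "locations": Counter({"helsinki": 2}),
        "terms": Counter({"election": 1, "budget": 2})}
weights = {"persons": 2.0, "locations": 1.0, "terms": 1.0}  # assumed weights

print(document_similarity(doc1, doc2, weights))
```

A class-specific measure, for example one that scores two locations by their distance in a geographical taxonomy, could be supplied through the `measures` argument in place of the cosine default.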
Abstract:
Ingarden (1962, 1964) postulates that artworks exist in an “Objective purely intentional” way. According to this view, objectivity and subjectivity are opposed forms of existence, parallel to the opposition between realism and idealism. Using arguments of cognitive science, experimental psychology, and semiotics, this lecture proposes that, particularly in aesthetic phenomena, realism and idealism are not pure oppositions; rather they are aspects of a single process of cognition in different strata. Furthermore, the concept of realism can be conceived as an empirical extreme of idealism, and the concept of idealism can be conceived as a pre-operative extreme of realism. Both kinds of systems of knowledge are mutually associated by a synecdoche, performing major tasks of mental order and categorisation. This contribution suggests that the supposed opposition between objectivity and subjectivity raises, first of all, a problem of translatability, more than a problem of existential categories. Synecdoche seems to be a very basic transaction of the mind, establishing ontologies (in the more Ingardean sense of the term). Wegrzecki (1994, 220) defines ontology as “the central domain of philosophy to which other its parts directly or indirectly refer”. Thus, ontology operates within philosophy as the synecdoche does within language, pointing the sense of the general into the particular and/or vice versa. The many affinities and similarities between different sign systems, like those found across the interrelationships of the arts, are embedded into a transversal, synecdochic intersemiosis. An important question, from this view, is whether Ingarden’s pure objectivities rest basically on the impossibility of translation, therefore being absolute self-referential constructions. In such a case, it would be impossible to translate pure intentionality into something else, like acts or products.
Abstract:
Based on the Aristotelian criterion referred to as 'abductio', Peirce suggests a method of hypothetical inference, which operates in a different way than the deductive and inductive methods. “Abduction is nothing but guessing” (Peirce, 7.219). This principle is of extreme value for the study of our understanding of mathematical self-similarity in both of its typical presentations: relative or absolute. For the first case, abduction incarnates the quantitative/qualitative relationships of a self-similar object or process; for the second case, abduction makes understandable the statistical treatment of self-similarity, 'guessing' the continuity of geometric features to infinity through the use of a systematic stereotype (for instance, the assumption that the general shape of the Sierpiński triangle continues identically into its particular shapes). The metaphor coined by Peirce, of an exact map containing within itself the same exact map (a map of itself), is not only the most important precedent of Mandelbrot’s problem of measuring the boundaries of a continuous irregular surface with a logarithmic ruler, but is also still a useful abstraction for the conceptualisation of relative and absolute self-similarity, and its mechanisms of implementation. It is also useful for explaining some of the most basic geometric ontologies as mental constructions: in the notion of infinite convergence of points in the corners of a triangle, or in the intuition for defining two parallel straight lines as two lines in a plane that 'never' intersect.