28 results for State-based Specifications
in Helda - Digital Repository of University of Helsinki
Abstract:
This dissertation is a theoretical study of finite-state based grammars used in natural language processing. The study is concerned with certain varieties of finite-state intersection grammars (FSIG) whose parsers define regular relations between surface strings and annotated surface strings. The study focuses on the following three aspects of FSIGs. (i) Computational complexity of grammars under limiting parameters: In the study, the computational complexity in practical natural language processing is approached through performance-motivated parameters on structural complexity. Each parameter splits some grammars in the Chomsky hierarchy into an infinite set of subset approximations. When the approximations are regular, they seem to fall into the logarithmic-time hierarchy and the dot-depth hierarchy of star-free regular languages. This theoretical result is important and possibly relevant to grammar induction. (ii) Linguistically applicable structural representations: Related to the linguistically applicable representations of syntactic entities, the study contains new bracketing schemes that cope with dependency links, left and right branching, crossing dependencies and spurious ambiguity. New grammar representations that resemble the Chomsky-Schützenberger representation of context-free languages are presented in the study, and they include, in particular, representations for mildly context-sensitive non-projective dependency grammars whose performance-motivated approximations are linear-time parseable. (iii) Compilation and simplification of linguistic constraints: Efficient compilation methods for certain regular operations such as generalized restriction are presented. These include an elegant algorithm that has already been adopted as the approach in a proprietary finite-state tool. In addition to the compilation methods, an approach to on-the-fly simplifications of finite-state representations for parse forests is sketched.
These findings are tightly coupled with each other under the theme of locality. I argue that the findings help us to develop better, linguistically oriented formalisms for finite-state parsing and to develop more efficient parsers for natural language processing. Keywords: syntactic parsing, finite-state automata, dependency grammar, first-order logic, linguistic performance, star-free regular approximations, mildly context-sensitive grammars
Abstract:
Finite-state methods have been adopted widely in computational morphology and related linguistic applications. To enable efficient development of finite-state based linguistic descriptions, these methods should be a freely available resource for academic language research and the language technology industry. The following needs can be identified: (i) a registry that maps the existing approaches, implementations and descriptions, (ii) managing the incompatibilities of the existing tools, (iii) increasing synergy and complementary functionality of the tools, (iv) persistent availability of the tools used to manipulate the archived descriptions, (v) an archive for free finite-state based tools and linguistic descriptions. Addressing these challenges contributes to building a common research infrastructure for advanced language technology.
Abstract:
There are numerous formats for writing spellcheckers for open-source systems and there are many descriptions for languages written in these formats. Similarly, for word hyphenation by computer there are TeX rules for many languages. In this paper we demonstrate a method for converting these spell-checking lexicons and hyphenation rule sets into finite-state automata, and present a new finite-state based system for writer’s tools used in current open-source software such as Firefox, OpenOffice.org and enchant via the spell-checking library voikko.
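As a rough illustration of what it means to turn a spell-checking lexicon into a finite-state automaton, the sketch below builds a simple, unminimized deterministic automaton (a trie) from a word list and uses it as an acceptor. This is not the conversion method of the paper and not the voikko API; the function names and the three-word lexicon are hypothetical and chosen only for illustration.

```python
def build_automaton(words):
    """Build a trie-shaped deterministic automaton from a word list.

    Returns (transitions, finals): transitions[s] maps a character to the
    next state, and finals is the set of accepting states. State 0 is the
    start state. The automaton is not minimized.
    """
    transitions = [{}]
    finals = set()
    for word in words:
        state = 0
        for ch in word:
            if ch not in transitions[state]:
                transitions.append({})
                transitions[state][ch] = len(transitions) - 1
            state = transitions[state][ch]
        finals.add(state)
    return transitions, finals


def accepts(automaton, word):
    """Return True if the automaton accepts the word (i.e. it is in the lexicon)."""
    transitions, finals = automaton
    state = 0
    for ch in word:
        state = transitions[state].get(ch)
        if state is None:
            return False
    return state in finals


# Hypothetical toy lexicon, for illustration only.
lexicon = build_automaton(["cat", "car", "dog"])
print(accepts(lexicon, "cat"), accepts(lexicon, "cart"))
```

A production system would additionally minimize the automaton and handle morphology, which this sketch does not attempt.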
Abstract:
The trees in the Penn Treebank have a standard representation that involves complete balanced bracketing. In this article, an alternative to this standard representation of the treebank is proposed. The proposed representation for the trees is lossless, but it reduces the total number of brackets by 28%. This is possible by omitting the redundant pairs of special brackets that encode initial and final embedding, using a technique proposed by Krauwer and des Tombe (1981). In terms of the paired brackets, the maximum nesting depth in sentences decreases by 78%. A coverage of 99.9% is achieved with only five non-top levels of paired brackets. The observed shallowness of the reduced bracketing suggests that finite-state based methods for parsing and searching could be a feasible option for treebank processing.
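The nesting-depth measure discussed in the abstract can be made concrete with a short sketch. The function below computes the maximum nesting depth of a fully bracketed tree string; it does not implement the reduced bracketing of the article or the Krauwer and des Tombe technique, and the function name and example tree are hypothetical.

```python
def max_bracket_depth(tree: str) -> int:
    """Return the maximum nesting depth of round brackets in a tree string.

    Raises ValueError if the brackets are not balanced.
    """
    depth = 0
    deepest = 0
    for ch in tree:
        if ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced brackets")
    if depth != 0:
        raise ValueError("unbalanced brackets")
    return deepest


# Hypothetical Penn-Treebank-style tree, for illustration only.
print(max_bracket_depth("(S (NP (DT the) (NN cat)) (VP (VBD sat)))"))
```

A bounded depth of this kind is what makes a finite-state treatment plausible: a fixed number of nesting levels can be tracked with finitely many states.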
Abstract:
Many active pharmaceutical ingredients (APIs) have both anhydrate and hydrate forms. Due to the different physicochemical properties of solid forms, changes in the solid state may result in therapeutic, pharmaceutical, legal and commercial problems. In order to obtain good solid dosage form quality and performance, there is a constant need to understand and control these phase transitions during manufacturing and storage. Thus it is important to detect and also quantify the possible transitions between the different forms. In recent years, vibrational spectroscopy has become an increasingly popular tool to characterise the solid-state forms and their phase transitions. It offers several advantages over other characterisation techniques, including an ability to obtain molecular level information, minimal sample preparation, and the possibility of monitoring changes non-destructively in-line. Dehydration is a phase transition of hydrates which is frequently encountered during dosage form production and storage. The aim of the present thesis was to investigate the dehydration behaviour of diverse pharmaceutical hydrates by near infrared (NIR), Raman and terahertz pulsed spectroscopic (TPS) monitoring together with multivariate data analysis. The goal was to reveal new perspectives for investigation of the dehydration at the molecular level. Solid-state transformations were monitored during dehydration of diverse hydrates on a hot stage. The results obtained from qualitative experiments were used to develop a method and perform the quantification of the solid-state forms during process-induced dehydration in a fluidised bed dryer. Both in situ and in-line process monitoring and quantification were performed. This thesis demonstrated the utility of vibrational spectroscopy techniques and multivariate modelling to monitor and investigate dehydration behaviour in situ and during fluidised bed drying.
All three spectroscopic methods proved complementary in the study of dehydration. NIR spectroscopy models could quantify the solid-state forms in the binary system, but were unable to quantify all the forms in the quaternary system. Raman spectroscopy models on the other hand could quantify all four solid-state forms that appeared upon isothermal dehydration. The speed of spectroscopic methods makes them applicable for monitoring dehydration and the quantification of multiple forms was performed during phase transition. Thus the solid-state structure information at the molecular level was directly obtained. TPS detected the intermolecular phonon modes and Raman spectroscopy detected mostly the changes in intramolecular vibrations. Both techniques revealed information about the crystal structure changes. NIR spectroscopy, on the other hand was more sensitive to water content and hydrogen bonding environment of water molecules. This study provides a basis for real time process monitoring using vibrational spectroscopy during pharmaceutical manufacturing.
Abstract:
The aim of this dissertation is to provide conceptual tools for the social scientist for clarifying, evaluating and comparing explanations of social phenomena based on formal mathematical models. The focus is on relatively simple theoretical models and simulations, not statistical models. These studies apply a theory of explanation according to which explanation is about tracing objective relations of dependence, knowledge of which enables answers to contrastive why and how-questions. This theory is developed further by delineating criteria for evaluating competing explanations and by applying the theory to social scientific modelling practices and to the key concepts of equilibrium and mechanism. The dissertation comprises an introductory essay and six published original research articles. The main theses about model-based explanations in the social sciences argued for in the articles are the following. 1) The concept of explanatory power, often used to argue for the superiority of one explanation over another, encompasses five dimensions which are partially independent and involve some systematic trade-offs. 2) Not all equilibrium explanations causally explain the obtaining of the end equilibrium state with the multiple possible initial states. Instead, they often constitutively explain the macro property of the system with the micro properties of the parts (together with their organization). 3) There is an important ambivalence in the concept of mechanism used in many model-based explanations and this difference corresponds to a difference between two alternative research heuristics. 4) Whether unrealistic assumptions in a model (such as a rational choice model) are detrimental to an explanation provided by the model depends on whether the representation of the explanatory dependency in the model is itself dependent on the particular unrealistic assumptions.
Thus evaluating whether a literally false assumption in a model is problematic requires specifying exactly what is supposed to be explained and by what. 5) The question of whether an explanatory relationship depends on particular false assumptions can be explored with the process of derivational robustness analysis, and the importance of robustness analysis accounts for some of the puzzling features of the tradition of model-building in economics. 6) The fact that economists have been relatively reluctant to use true agent-based simulations to formulate explanations can partially be explained by the specific ideal of scientific understanding implicit in the practice of orthodox economics.
Abstract:
The purpose of the research was to study how Finnish lower-stage schools participating in the international network of UNESCO schools, also called the Associated Schools Project (ASP), prepare their students for the future at the level of their school-based curriculums. In the research, future trends were discussed, and the importance of their consideration in educational practice was explained from a global viewpoint. Based on an examination of today's problematic world state, and development trends characterized by globalization, the challenges and demands set for schooling and education in the future were discussed. Understanding the significance of an individual's action and responsibility was considered to be the central resource for building a more just and sustainable future. The study was grounded on a theoretical model developed by the researcher, which combined the models of Dalin & Rust (1996) and UNESCO (Delors et al. 1996) about future-oriented learning. The model consists of four basic elements of curriculum: "Nature", "Culture", "Myself", and "Others", and four dimensions of learning: "Learning to know", "Learning to do", "Learning to live together" and "Learning to be". The model represents the holistic aspect of educational theory, and its aim is to maintain a balance between its different components. The research material consisted of the school-based curriculums of ten lower-stage UNESCO schools. They were analyzed using the theoretical model by the methodology of content analysis. The research results were notably consistent between the different schools. They showed cultural learning and learning concerned with "myself" to be clearly more emphasized than learning referring to nature and other people. In addition, they reflected the central position of subjects, knowledge and skills, thus leaving the development of the pupils' personalities, and particularly learning concerned with living with other people, in a marginal role.
The question of whether the schools prepare for the future in terms of their curriculums was discussed in the light of the results. The research offered a way and a model to approach the relationship between education and the future, and to evaluate schools' future-orientation. Based on the results, it is suggested that the schools lay more stress on learning concerned with nature and other people, and focus more on developing the mental capacities of their pupils and the competencies they need for living with other people. Above all, what the present societies require of schools is education which produces balanced and broadly aware human beings who have the mental strength to face the challenges of the future and the abilities to direct it along the lines they desire. Keywords: future, curriculum, content analysis
Abstract:
Basidiomycetous white-rot fungi are the only organisms that can efficiently decompose all the components of wood. Moreover, white-rot fungi possess the ability to mineralize recalcitrant lignin polymer with their extracellular, oxidative lignin-modifying enzymes (LMEs), i.e. laccase, lignin peroxidase (LiP), manganese peroxidase (MnP), and versatile peroxidase (VP). Within one white-rot fungal species LMEs are typically present as several isozymes encoded by multiple genes. This study focused on two efficient lignin-degrading white-rot fungal species, Phlebia radiata and Dichomitus squalens. Molecular level knowledge of the LMEs of the Finnish isolate P. radiata FBCC43 (79, ATCC 64658) was complemented with cloning and characterization of a new laccase (Pr-lac2), two new LiP-encoding genes (Pr-lip1, Pr-lip4), and the Pr-lip3 gene that has been previously described only at the cDNA level. Also, two laccase-encoding genes (Ds-lac3, Ds-lac4) of D. squalens were cloned and characterized for the first time. Phylogenetic analysis revealed close evolutionary relationships between the P. radiata LiP isozymes. Distinct protein phylogeny for both P. radiata and D. squalens laccases suggested different physiological functions for the corresponding enzymes. Supplementation of P. radiata liquid culture medium with excess Cu2+ notably increased laccase activity, and good fungal growth was achieved in complex medium rich in organic nitrogen. Wood is the natural substrate of lignin-degrading white-rot fungi, supporting production of enzymes and metabolites needed for fungal growth and the breakdown of lignocellulose. In this work, emphasis was on solid-state wood or wood-containing cultures that mimic the natural growth conditions of white-rot fungi. Transcript analyses showed that wood promoted expression of all the presently known LME-encoding genes of P. radiata and laccase-encoding genes of D. squalens. Expression of the studied individual LME-encoding genes of P. radiata and D.
squalens was unequal in transcript quantities and apparently time-dependent, thus suggesting the importance of several distinct LMEs within one fungal species. In addition to LMEs, white-rot fungi secrete other compounds that are important in decomposition of wood and lignin. One of these compounds is oxalic acid, which is a common metabolite of wood-rotting fungi. Fungi produce also oxalic-acid degrading enzymes of which the most widespread is oxalate decarboxylase (ODC). However, the role of ODC in fungi is still ambiguous with propositions from regulation of intra and extracellular oxalic acid levels to a function in primary growth and concomitant production of ATP. In this study, intracellular ODC activity was detected in four white-rot fungal species, and D. squalens showed the highest ODC activity upon exposure to oxalic acid. Oxalic acid was the most common organic acid secreted by the ODC-positive white-rot fungi and the only organic acid detected in wood cultures. The ODC-encoding gene Ds-odc was cloned from two strains of D. squalens showing the first characterization of an odc-gene from a white-rot polypore species. Biochemical properties of the D. squalens ODC resembled those described for other basidiomycete ODCs. However, the translated amino acid sequence of Ds-odc has a novel N-terminal primary structure with a repetitive Ala-Ser-rich region of ca 60 amino acid residues in length. Expression of the Ds-odc transcripts suggested a constitutive metabolic role for the corresponding ODC enzyme. According to the results, it is proposed that ODC may have an essential implication for the growth and basic metabolism of wood-decaying fungi.
Abstract:
There is intense activity in the area of theoretical chemistry of gold. It is now possible to predict new molecular species, and more recently, solids by combining relativistic methodology with isoelectronic thinking. In this thesis we predict a series of solid sheet-type crystals for Group-11 cyanides, MCN (M=Cu, Ag, Au), and Group-2 and 12 carbides MC2 (M=Be-Ba, Zn-Hg). The idea of sheets is then extended to nanostrips which can be bent to nanorings. The bending energies and deformation frequencies can be systematized by treating these molecules as elastic bodies. In these species Au atoms act as an 'intermolecular glue'. Further suggested molecular species are the new uncongested aurocarbons, and the neutral Au_nHg_m clusters. Many of the suggested species are expected to be stabilized by aurophilic interactions. We also estimate the MP2 basis-set limit of the aurophilicity for the model compounds [ClAuPH_3]_2 and [P(AuPH_3)_4]^+. Besides investigating the size of the basis set applied, our research confirms that the 19-VE TZVP+2f level, used a decade ago, already produced 74% of the present aurophilic attraction energy for the [ClAuPH_3]_2 dimer. Likewise we verify the preferred C4v structure for the [P(AuPH_3)_4]^+ cation at the MP2 level. We also perform the first calculation on model aurophilic systems using the SCS-MP2 method and compare the results to high-accuracy CCSD(T) ones. The recently obtained high-resolution microwave spectra on MCN molecules (M=Cu, Ag, Au) provide an excellent testing ground for quantum chemistry. MP2 or CCSD(T) calculations, correlating all 19 valence electrons of Au and including BSSE and SO corrections, are able to give bond lengths to 0.6 pm, or better. Our calculated vibrational frequencies are expected to be better than the currently available experimental estimates. Qualitative evidence for multiple Au-C bonding in triatomic AuCN is also found.
Abstract:
NMR spectroscopy enables the study of biomolecules from peptides and carbohydrates to proteins at atomic resolution. The technique uniquely allows for structure determination of molecules in the solution state. It also gives insights into dynamics and intermolecular interactions important for determining biological function. Detailed molecular information is entangled in the nuclear spin states. The information can be extracted by pulse sequences designed to measure the desired molecular parameters. Advancement of pulse sequence methodology therefore plays a key role in the development of biomolecular NMR spectroscopy. A range of novel pulse sequences for solution-state NMR spectroscopy are presented in this thesis. The pulse sequences are described in relation to the molecular information they provide. The pulse sequence experiments represent several advances in NMR spectroscopy with particular emphasis on applications for proteins. Some of the novel methods focus on methyl-containing amino acids, which are pivotal for structure determination. Methyl-specific assignment schemes are introduced for increasing the size range of 13C,15N-labeled proteins amenable to structure determination without resorting to more elaborate labeling schemes. Furthermore, cost-effective means are presented for monitoring amide and methyl correlations simultaneously. Residual dipolar couplings can be applied for structure refinement as well as for studying dynamics. Accurate methods for measuring residual dipolar couplings in small proteins are devised along with special techniques applicable when proteins require high pH or high temperature solvent conditions. Finally, a new technique is demonstrated to diminish strong-coupling induced artifacts in HMBC, a routine experiment for establishing long-range correlations in unlabeled molecules. The presented experiments facilitate structural studies of biomolecules by NMR spectroscopy.
Abstract:
The study explores new ideational changes in the information strategy of the Finnish state between 1998 and 2007, after a juncture in Finnish governing in the early 1990s. The study scrutinizes the economic reframing of institutional openness in Finland that comes with significant and often unintended institutional consequences of transparency. Most notably, the constitutional principle of publicity (julkisuusperiaate), a Nordic institutional peculiarity allowing public access to state information, is now becoming an instrument of economic performance and accountability through results. Finland has a long institutional history in the publicity of government information, acknowledged by law since 1951. Nevertheless, access to government information became a policy concern in the mid-1990s, involving a historical narrative of openness as a Nordic tradition of Finnish governing: Nordic openness (pohjoismainen avoimuus). International interest in transparency of governance has also marked an opening for institutional re-descriptions in the Nordic context. The essential added value, or contradictory term, that transparency has on the Finnish conceptualisation of governing is the innovation that public acts of governing can be economically efficient. This is most apparent in the new attempts at providing standardised information on government and expressing it in numbers. In Finland, the publicity of government information has been a concept of democratic connotations, but new internationally diffusing ideas of performance and national economic competitiveness are discussed under the notion of transparency and its peer concepts openness and public (sector) information, which are also newcomers to the Finnish vocabulary of governing. The above concepts often conflict with one another, paving the way to unintended consequences for the reforms conducted in their name.
Moreover, the study argues that the policy concerns over openness and public sector information are linked to the new drive for transparency. Drawing on theories of new institutionalism, political economy, and conceptual history, the study argues for a reinvention of Nordic openness in two senses. First, in referring to institutional history, the policy discourse of Nordic openness discovers an administrative tradition in response to new dilemmas of public governance. Moreover, this normatively appealing discourse also legitimizes the new ideational changes. Second, a former mechanism of democratic accountability is being reframed with market and performance ideas, mostly originating from the sphere of transnational governance and governance indices. Mobilizing different research techniques and data (public documents of the Finnish government and international organizations, some 30 interviews of Finnish civil servants, and statistical time series), the study asks how the above ideational changes have been possible, pointing to the importance of nationalistically appealing historical narratives and normative concepts of governing. Concerning institutional developments, the study analyses the ideational changes in central steering mechanisms (political, normative and financial steering) and the introduction of budget transparency and performance management in two cases: census data (Population Register Centre) and foreign political information (Ministry for Foreign Affairs). The new policy domain of governance indices is also explored as a type of transparency. The study further asks what institutional transformations are to be observed in the above cases and in the accountability system. The study concludes that while the information rights of citizens have been reinforced and recalibrated during the period under scrutiny, there has also been a conversion of institutional practices towards economic performance. 
As the discourse of Nordic openness has been rather unquestioned, the new internationally circulating ideas of transparency and the knowledge economy have entered this discourse without public notice. Since the mid-1990s, state registry data has been perceived as an exploitable economic resource in Finland and in the EU (public sector information). This is a parallel development to the new drive for budget transparency in organisations as vital to the state as the Population Register Centre, which has led to marketization of census data in Finland, an international exception. In the Finnish Ministry for Foreign Affairs, the post-Cold War rhetorical shift from secrecy to performance-driven openness marked a conversion in institutional practices that now hold information services in high regard. But this has not necessarily led to increased publicity of foreign political information. In this context, openness is also defined as sharing information with select actors, as a trust-based non-public activity, deemed necessary amid global economic competition. Regarding the accountability system, deliberation and performance now overlap, making it increasingly difficult to identify to whom and for what the public administration is accountable. These evolving institutional practices are characterised by unintended consequences and paradoxes. History is a paradoxical component in the above institutional change, as long-term institutional developments now justify short-term reforms.
Abstract:
Data assimilation provides an initial atmospheric state, called the analysis, for Numerical Weather Prediction (NWP). This analysis consists of pressure, temperature, wind, and humidity on a three-dimensional NWP model grid. Data assimilation blends meteorological observations with the NWP model in a statistically optimal way. The objective of this thesis is to describe methodological development carried out in order to allow data assimilation of ground-based measurements of the Global Positioning System (GPS) into the High Resolution Limited Area Model (HIRLAM) NWP system. Geodetic processing produces observations of tropospheric delay. These observations can be processed either for vertical columns at each GPS receiver station, or for the individual propagation paths of the microwave signals. These alternative processing methods result in Zenith Total Delay (ZTD) and Slant Delay (SD) observations, respectively. ZTD and SD observations are of use in the analysis of atmospheric humidity. A method is introduced for estimation of the horizontal error covariance of ZTD observations. The method makes use of observation minus model background (OmB) sequences of ZTD and conventional observations. It is demonstrated that the ZTD observation error covariance is relatively large in station separations shorter than 200 km, but non-zero covariances also appear at considerably larger station separations. The relatively low density of radiosonde observing stations limits the ability of the proposed estimation method to resolve the shortest length-scales of error covariance. SD observations are shown to contain a statistically significant signal on the asymmetry of the atmospheric humidity field. However, the asymmetric component of SD is found to be nearly always smaller than the standard deviation of the SD observation error. SD observation modelling is described in detail, and other issues relating to SD data assimilation are also discussed. 
These include the determination of error statistics, the tuning of observation quality control, and accounting for local observation error correlation. The experiments show that the data assimilation system is able to retrieve the asymmetric information content of hypothetical SD observations at a single receiver station. Moreover, the impact of real SD observations on humidity analysis is comparable to that of other observing systems.
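The idea of blending observations with a model background in a statistically optimal way can be illustrated with a scalar textbook case. The sketch below is not the HIRLAM assimilation scheme and does not model ZTD or SD observations; it assumes a single background value and a single observation with independent, unbiased errors of known variance, and all names and numbers are hypothetical.

```python
def blend(background, observation, var_background, var_observation):
    """Minimum-variance combination of one background value and one observation.

    Assumes independent, unbiased errors with the given variances.
    Returns (analysis, analysis_variance).
    """
    # Weight given to the observation-minus-background (OmB) departure.
    gain = var_background / (var_background + var_observation)
    analysis = background + gain * (observation - background)
    analysis_variance = (1.0 - gain) * var_background
    return analysis, analysis_variance


# Hypothetical values: equal error variances give the midpoint.
print(blend(10.0, 12.0, 1.0, 1.0))
```

The smaller the observation error variance relative to the background error variance, the closer the analysis is drawn to the observation, which is why the error statistics discussed in the abstract matter.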
Abstract:
The output of a laser is a high frequency propagating electromagnetic field with superior coherence and brightness compared to that emitted by thermal sources. A multitude of different types of lasers exist, which also translates into large differences in the properties of their output. Moreover, the characteristics of the electromagnetic field emitted by a laser can be influenced from the outside, e.g., by injecting an external optical field or by optical feedback. In the case of free-running solitary class-B lasers, such as semiconductor and Nd:YVO4 solid-state lasers, the phase space is two-dimensional, the dynamical variables being the population inversion and the amplitude of the electromagnetic field. The two-dimensional structure of the phase space means that no complex dynamics can be found. If a class-B laser is perturbed from its steady state, then the steady state is restored after a short transient. However, as discussed in part (i) of this Thesis, the static properties of class-B lasers, as well as their artificially or noise induced dynamics around the steady state, can be experimentally studied in order to gain insight on laser behaviour, and to determine model parameters that are not known ab initio. In this Thesis particular attention is given to the linewidth enhancement factor, which describes the coupling between the gain and the refractive index in the active material. A highly desirable attribute of an oscillator is stability, both in frequency and amplitude. Nowadays, however, instabilities in coupled lasers have become an active area of research motivated not only by the interesting complex nonlinear dynamics but also by potential applications. In part (ii) of this Thesis the complex dynamics of unidirectionally coupled, i.e., optically injected, class-B lasers is investigated. An injected optical field increases the dimensionality of the phase space to three by turning the phase of the electromagnetic field into an important variable. 
This has a radical effect on laser behaviour, since very complex dynamics, including chaos, can be found in a nonlinear system with three degrees of freedom. The output of the injected laser can be controlled in experiments by varying the injection rate and the frequency of the injected light. In this Thesis the dynamics of unidirectionally coupled semiconductor and Nd:YVO4 solid-state lasers is studied numerically and experimentally.