960 results for LFG grammar and parsing
Abstract:
For more than forty years, research has been ongoing into the use of the computer in the processing of natural language. During this period methods have evolved, with various parsing techniques and grammars coming to prominence. Problems still exist, not least in the field of Machine Translation. However, one of the successes in this field is the translation of sublanguage. The present work reports on Deterministic Parsing, a relatively new parsing technique, and its application to the sublanguage of an aircraft maintenance manual for Machine Translation. The aim has been to investigate the practicability of using Deterministic Parsers in the analysis stage of a Machine Translation system. Machine Translation, sublanguage, and parsing are described in general terms, with a detailed review of the Deterministic parsing systems pertinent to this research. The interaction between Machine Translation, sublanguage, and parsing, including Deterministic parsing, is also highlighted. Two types of Deterministic Parser have been investigated: a Marcus-type parser, based on the basic design of the original Deterministic parser (Marcus, 1980), and an LR-type Deterministic Parser for natural language, based on the LR parsing algorithm. In total, four Deterministic Parsers have been built and are described in the thesis. Two of the Deterministic Parsers are prototypes, from which the remaining two parsers, to be used on sublanguage, have been developed. This thesis reports the results of parsing by the prototypes: a Marcus-type parser and an LR-type parser with a grammatical and linguistic range similar to that of the original Marcus parser. The Marcus-type parser uses a grammar of production rules, whereas the LR-type parser employs a Definite Clause Grammar (DCG).
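The shift-reduce mechanics underlying the LR approach mentioned in this abstract can be illustrated with a toy parser. The sketch below is not the thesis's parser: the two grammar rules, the tiny lexicon, and the greedy reduce strategy are assumptions chosen for brevity (a real LR parser drives shifts and reductions from a precomputed action table).

```python
# Toy shift-reduce parsing for a tiny, hypothetical grammar:
#   S  -> NP V
#   NP -> Det N
RULES = [("NP", ("Det", "N")), ("S", ("NP", "V"))]
LEXICON = {"the": "Det", "cat": "N", "sleeps": "V"}

def parse(words):
    """Shift word categories onto a stack; greedily reduce whenever
    a rule's right-hand side appears on top of the stack."""
    stack = []
    for word in words:
        stack.append(LEXICON[word])        # shift
        reduced = True
        while reduced:                     # reduce as long as possible
            reduced = False
            for lhs, rhs in RULES:
                if tuple(stack[-len(rhs):]) == rhs:
                    del stack[-len(rhs):]
                    stack.append(lhs)
                    reduced = True
    return stack

print(parse(["the", "cat", "sleeps"]))  # ['S']
```

A successful parse leaves a single start symbol on the stack; anything else signals that the input does not match the grammar.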
Abstract:
This paper examines the beliefs and practices about the integration of grammar and skills teaching reported by 176 English language teachers from 18 countries. Teachers completed a questionnaire which elicited beliefs about grammar teaching generally as well as specific beliefs and reported practices about the integration of grammar and skills teaching. Teachers expressed strong beliefs in the need to avoid teaching grammar in isolation and reported high levels of integration of grammar in their practices. This study also examines how teachers conceptualize integration and the sources of evidence they draw on in assessing the effectiveness of their instructional practices in teaching grammar. The major findings of this paper stem from an analysis of these two issues. A range of ways in which teachers understood integration is identified and classified into two broad orientations, which we label temporal and contextual. An analysis of the evidence which teachers cited in making judgements about the effectiveness of their grammar teaching practices showed that it was overwhelmingly practical and experiential and did not refer in any explicit way to second language acquisition theory. Given the volume of available theory about L2 grammar teaching generally and integration specifically, the lack of direct reference to such evidence in teachers' accounts is noteworthy.
Abstract:
This study examined university students' writing skills as perceived by university students and their English instructors. The goal of the study was to provide English instructors with objective, quantified information about writing perceptions from both the students' and instructors' viewpoints. A survey instrument was developed based on one created by Newkirk, Cameron, and Selfe (1977) to identify instructors' perceived knowledge of student writing skills. The present study used a descriptive statistical design. It examined five writing skill areas (attitude, content, grammar and mechanics, literary considerations, and the writing process) through a questionnaire completed by a convenience sample of summer- and fall-admitted freshmen enrolled in Essay Writing and Freshman Composition courses, and by English Department instructors, at a large South Florida public university. The study consisted of five phases. The first phase was modifying the Newkirk, Cameron, and Selfe (1977) questionnaire; two versions of the revised survey were developed, one for instructors and one for students. The second phase was pilot testing the questionnaire to evaluate administration and scoring. The third phase was administering the questionnaire to 1,280 students and 48 instructors. The fourth phase was analyzing the data. The study found a significant difference between the perceptions of students and instructors in all areas of writing skills examined by the survey. Responses to 29 of 30 questions showed that students felt they had better attitudes toward writing and better writing skills than instructors thought. The final phase was developing recommendations for practice. Based on the findings, together with theory and empirical evidence drawn from the fields of adult education and composition research, learner-centered, self-directed curriculum guidelines are offered.
By objectively quantifying student and instructor perceptions of students' writing skills, this study contributes to a growing body of literature that: (a) encourages instructors to acknowledge the perception disparities between instructors and students; (b) gives instructors a better understanding of how to communicate with students; and (c) recommends the development of new curricula, placement tests, and courses that meet the needs of students and enable English instructors to provide meaningful instruction.
Abstract:
The increasing amount of available semistructured data demands efficient mechanisms to store, process, and search an enormous corpus of data to encourage its global adoption. Current techniques to store semistructured documents either map them to relational databases or use a combination of flat files and indexes. These two approaches result in a mismatch between the tree structure of semistructured data and the access characteristics of the underlying storage devices. Furthermore, the inefficiency of XML parsing methods has slowed the large-scale adoption of XML in actual system implementations. The recent development of lazy parsing techniques is a major step towards improving this situation, but lazy parsers still have significant drawbacks that undermine the massive adoption of XML. Once the processing (storage and parsing) issues for semistructured data have been addressed, another key challenge in leveraging semistructured data is to perform effective information discovery on such data. Previous works have addressed this problem in a generic (i.e., domain-independent) way, but this process can be improved if knowledge about the specific domain is taken into consideration. This dissertation had two general goals. The first goal was to devise novel techniques to efficiently store and process semistructured documents. This goal had two specific aims: we proposed a method for storing semistructured documents that maps the physical characteristics of the documents to the geometrical layout of hard drives, and we developed a Double-Lazy Parser for semistructured documents which introduces lazy behavior in both the pre-parsing and progressive parsing phases of the standard Document Object Model's parsing mechanism. The second goal was to construct a user-friendly and efficient engine for performing Information Discovery over domain-specific semistructured documents.
This goal also had two aims: we presented a framework that exploits domain-specific knowledge to improve the quality of the information discovery process by incorporating domain ontologies, and we proposed meaningful evaluation metrics to compare the results of search systems over semistructured documents.
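The lazy-parsing idea this abstract builds on can be contrasted with the incremental parsing already available in Python's standard library. The sketch below is not the dissertation's Double-Lazy Parser; it only shows the standard `xml.etree.ElementTree.iterparse` API, which yields elements as they are completed instead of materializing the whole DOM tree up front, and the sample document is invented.

```python
import io
import xml.etree.ElementTree as ET

# A small invented document; a real workload would be a large file.
doc = b"<log><entry id='1'>ok</entry><entry id='2'>fail</entry></log>"

def entry_texts(stream):
    """Stream over <entry> elements without holding the full tree."""
    texts = []
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "entry":
            texts.append(elem.text)
            elem.clear()  # free the subtree already processed
    return texts

print(entry_texts(io.BytesIO(doc)))  # ['ok', 'fail']
```

The `elem.clear()` call is what keeps memory bounded: processed subtrees are discarded rather than accumulated, which is the core trade-off lazy and incremental parsers exploit.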
Abstract:
Coral reefs are increasingly threatened by global and local anthropogenic stressors, such as rising seawater temperature and nutrient enrichment. These two stressors vary widely across the reef face, and parsing out their influence on coral communities at reef-system scales has been particularly challenging. Here, we investigate the influence of temperature and nutrients on coral community traits and life history strategies on lagoonal reefs across the Belize Mesoamerican Barrier Reef System (MBRS). A novel metric was developed using ultra-high-resolution sea surface temperatures (SST) to classify reefs as enduring low (lowTP), moderate (modTP), or extreme (extTP) temperature parameters over 10 years (2003 to 2012). Chlorophyll-a (chl a) records obtained for the same interval were employed as a proxy for bulk nutrients, and these records were complemented with in situ measurements to "sea truth" nutrient content across the three reef types. Chl a concentrations were highest at extTP sites, intermediate at modTP sites, and lowest at lowTP sites. Coral species richness, abundance, diversity, density, and percent cover were lower at extTP sites than at lowTP and modTP sites, but these community traits did not differ between lowTP and modTP sites. Coral life history strategy analyses showed that extTP sites were dominated by hardy stress-tolerant and fast-growing weedy coral species, while lowTP and modTP sites consisted of competitive, generalist, weedy, and stress-tolerant coral species. These results suggest that the differences in coral community traits and life history strategies between extTP and lowTP/modTP sites were driven primarily by temperature, with differences in nutrients across site types playing a lesser role.
The dominance of weedy and stress-tolerant genera at extTP sites suggests that corals utilizing these two life history strategies may be better suited to cope with warming oceans and thus may warrant further protective status during this interval of climate change.
Data associated with this project are archived here, including:
-SST data
-Satellite Chl a data
-Nutrient measurements
-Raw coral community survey data
For questions contact Justin Baumann (j.baumann3
Abstract:
This paper is a tutorial on defining recursive descent parsers in Haskell. In the spirit of one-stop shopping, the paper combines material from three areas into a single source. The three areas are functional parsers, the use of monads to structure functional programs, and the use of special syntax for monadic programs in Haskell. More specifically, the paper shows how to define monadic parsers using do notation in Haskell. The paper is targeted at the level of a good undergraduate student who is familiar with Haskell, and has completed a grammars and parsing course. Some knowledge of functional parsers would be useful, but no experience with monads is assumed.
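The tutorial itself works in Haskell; as a rough illustration of the same combinator idea, here is a sketch in Python, where a parser is a function from an input string to a list of (result, remaining input) pairs (mirroring the common Haskell type `String -> [(a, String)]`) and `bind` plays the role of Haskell's `>>=`. All names here are assumptions for this sketch, not the tutorial's definitions.

```python
# A parser is a function: str -> list of (result, rest) pairs;
# an empty list signals failure.

def pure(value):
    """Succeed without consuming input (Haskell's return/pure)."""
    return lambda s: [(value, s)]

def bind(parser, f):
    """Sequence two parsers; f builds the next parser from a result."""
    return lambda s: [pair
                      for value, rest in parser(s)
                      for pair in f(value)(rest)]

def item(s):
    """Consume a single character."""
    return [(s[0], s[1:])] if s else []

def sat(pred):
    """Consume one character satisfying a predicate."""
    return bind(item, lambda c: pure(c) if pred(c) else (lambda s: []))

digit = sat(str.isdigit)

# The analogue of a do-block: d1 <- digit; d2 <- digit; pure (...)
two_digits = bind(digit, lambda d1:
             bind(digit, lambda d2:
             pure(int(d1 + d2))))

print(two_digits("42abc"))  # [(42, 'abc')]
```

Haskell's do notation hides exactly this nesting of `bind` calls, which is why, as the paper argues, monadic parsers read almost like the grammars they implement.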
Abstract:
Over the last sixty years, a steadily maintained process of convergence towards the Castilian national standard has been occurring in Southern Spain, affecting urban middle-class speakers' varieties, particularly in phonology and lexis. As a consequence, unmarked features characterising innovative southern pronunciation have become less frequent and, at the same time, certain standard marked features have been adapted to the southern phonemic inventory. Urban middle-class varieties have thus progressively stretched the distance separating them from working-class and rural varieties, and moved closer to central Castilian varieties. Intermediate, yet incipient, koineised varieties have been described, including the transitional Murcia and Extremadura dialects (Hernández & Villena 2009; Villena, Vida & von Essen 2015). (1) Some of the standard phonologically marked features have spread among southern speakers exclusively on the basis of their mainstream social prestige, producing not only changes in the obstruent phoneme inventory (i.e. acquisition of the /s/ vs. /θ/ contrast) but also the standstill and even reversion of old consonant push- or pull-chain shifts (e.g. /h/ or /d/ fortition, affricate /ʧ/, etc.), as well as a shift in traditional lexis (Villena et al. 2016). Internal (grammar and word frequency) and external (stratification, network, and style) factors constraining those features follow similar patterns in the Andalusian speech communities analysed so far (Granada, Malaga), but when we zoom in on central varieties, which are closer to the national standard and therefore more conservative, differences in frequency increase and conflict sites emerge. (2) Unmarked 'natural' phonological features characterising southern dialects, particularly deletion of syllable-final consonants, do not keep pace with this trend of convergence towards the standard. Thus a combination of southern innovative syllable-final features and standard conservative onset-consonant features coexists.
(3) The main idea is that this intermediate variety is formed through changes suggesting that Andalusian speakers look for the best way of accepting marked prestige features without altering the coherence of their inventory. Either the reorganisation of the innovative phonemic system so that it may include the Castilian standard /s/ vs. /θ/ contrast, or the re-syllabification of aspirated /s/ before a dental stop, are excellent examples of how and why linguistic features are able to form intermediate varieties along the dialect-standard continuum.
Abstract:
Master's dissertation, Language Sciences, Faculdade de Ciências Humanas e Sociais, Universidade do Algarve, 2016
Abstract:
Intelligent systems are now inherent to society, supporting a synergistic human-machine collaboration. Beyond economic and climate factors, energy consumption is strongly affected by the performance of computing systems, and poor software functioning may invalidate any improvement attempt. In addition, data-driven machine learning algorithms are the basis for human-centered applications, and their interpretability is one of the most important features of computational systems. Software maintenance is a critical discipline for supporting automatic, life-long system operation. As most software registers its inner events by means of logs, log analysis is an approach to keeping systems operational. Logs are characterized as Big Data assembled in large-flow streams, being unstructured, heterogeneous, imprecise, and uncertain. This thesis addresses fuzzy and neuro-granular methods to provide maintenance solutions applied to anomaly detection (AD) and log parsing (LP), dealing with data uncertainty and identifying ideal time periods for detailed software analyses. LP provides a deeper semantic interpretation of the anomalous occurrences. The solutions evolve over time and are general-purpose, being highly applicable, scalable, and maintainable. Granular classification models, namely the Fuzzy set-Based evolving Model (FBeM), the evolving Granular Neural Network (eGNN), and the evolving Gaussian Fuzzy Classifier (eGFC), are compared on the AD problem. The evolving Log Parsing (eLP) method is proposed to approach automatic parsing of system logs. All the methods perform recursive mechanisms to create, update, merge, and delete information granules according to the behavior of the data. For the first time in the evolving intelligent systems literature, the proposed method, eLP, is able to process streams of words and sentences.
In terms of AD accuracy, FBeM achieved (85.64 ± 3.69)%, eGNN reached (96.17 ± 0.78)%, eGFC obtained (92.48 ± 1.21)%, and eLP reached (96.05 ± 1.04)%. Besides being competitive, eLP in particular generates a log grammar and presents a higher level of model interpretability.
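Log parsing in the sense used here means turning raw log lines into templates plus parameters. The sketch below is a crude, generic stand-in, not the eLP method: it merely masks numeric tokens with a wildcard, whereas eLP evolves its templates from streams of words and sentences.

```python
import re

def parse_log_line(line):
    """Split a raw log line into a template and its parameters by
    masking numeric tokens with a <*> wildcard."""
    params = re.findall(r"\d+", line)      # the variable parts
    template = re.sub(r"\d+", "<*>", line)  # the fixed structure
    return template, params

print(parse_log_line("connection from 10.0.0.5 port 5432"))
# ('connection from <*>.<*>.<*>.<*> port <*>', ['10', '0', '0', '5', '5432'])
```

Grouping lines by template is what yields the "log grammar" view of a system's behavior that downstream anomaly detection can consume.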
Abstract:
BAMoL (Business Application Modeling Language) is a domain-specific language used to develop solutions for the myMIS platform, in the context of management information systems. The language lacked two things: a formalization, and mechanisms for the syntactic validation of the solutions developed with it. These shortcomings made it impossible to syntactically validate solutions developed in the language, increasing the likelihood of errors, which can make the solutions less efficient and may even increase the platform's maintenance costs. To solve the first problem, a textual description of all the constituents of the language was produced and a grammar representing it was created, covering all of its elements and rules. To solve the second, a tool was built that uses this grammar to syntactically validate the solutions developed and locate their faults. In this way, it becomes possible to detect errors in the solutions, giving the development team greater control over them and allowing the solutions to be made more correct with respect to the rules of the language.
Abstract:
Doctoral thesis, in the speciality of Political Science, presented to the FDUNL
Abstract:
Doctoral thesis in Educational Sciences (speciality in Literacies and the Teaching of Portuguese)
Abstract:
The main objective of this project is to create a compiler generator based on bottom-up parsers. As a basis for this parser, the Cosel language and the Com module will be used; the latter is a compiler generator based on top-down parsers that is currently used in the practical sessions of the Compilers I course. The new generator, which takes a grammar as input, must check whether it is an LALR(1) bottom-up grammar and parse an input string of symbols using that grammar.
Abstract:
Students in the first cycle of ESO face the difficulties of a new stage of schooling and the need to demonstrate their linguistic competence in drafting and correctly producing a variety of texts. The new information and communication technologies can help develop these abilities (and should foster them), and should act as a motivating element so that students feel the need to create their own written work, linguistically appropriate, coherent, and with well-organized ideas. It is also necessary that the assessment of these tasks help them improve their expression in Catalan, both grammatically and orthographically, helping them establish systems of self-correction and error rectification. Publishing their work publicly (in a medium such as the internet) should reinforce their motivation to write, bringing them new expectations of communication and encouraging them to take part in the virtual space offered by this window open to the world. My work is intended as an educational project to improve the written expression of first-year ESO students with the help of the tools of Information and Communication Technologies.