72 results for parser


Relevance:

10.00%

Publisher:

Abstract:

This paper presents an overview of the MPEG-7 Description Definition Language (DDL). The DDL provides the syntactic rules for creating, combining, extending and refining MPEG-7 Descriptors (Ds) and Description Schemes (DSs). In the interests of interoperability, the W3C's XML Schema language, with the addition of certain MPEG-7-specific extensions, has been chosen as the DDL. This paper describes the background to this decision and, using examples, provides an overview of the core XML Schema features used within MPEG-7 and the extensions made in order to satisfy the MPEG-7 DDL requirements.
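As a hedged illustration of the mechanism the DDL builds on, the short Python sketch below validates a toy descriptor instance against a minimal XML Schema using the lxml library. The MediaDuration element and the schema itself are invented for this example and are not part of the actual MPEG-7 schemas, which also rely on DDL-specific extensions not shown here.

# Minimal sketch of XML Schema validation, the core mechanism the DDL
# inherits from W3C XML Schema. The schema and element are toy examples.
from lxml import etree

SCHEMA = b"""<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="MediaDuration" type="xs:duration"/>
</xs:schema>"""

schema = etree.XMLSchema(etree.XML(SCHEMA))
doc = etree.XML(b"<MediaDuration>PT1M30S</MediaDuration>")
print(schema.validate(doc))   # True: PT1M30S is a valid xs:duration value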

Relevance:

10.00%

Publisher:

Abstract:

Incremental parsing has long been recognized as a technique of great utility in the construction of language-based editors, and correspondingly, the area currently enjoys a mature theory. Unfortunately, many practical considerations have been largely overlooked in previously published algorithms. Many user requirements for an editing system necessarily impact on the design of its incremental parser, but most approaches focus only on one: response time. This paper details an incremental parser based on LR parsing techniques and designed for use in a modeless syntax recognition editor. The nature of this editor places significant demands on the structure and quality of the document representation it uses, and hence, on the parser. The strategy presented here is novel in that both the parser and the representation it constructs are tolerant of the inevitable and frequent syntax errors that arise during editing. This is achieved by a method that differs from conventional error repair techniques, and that is more appropriate for use in an interactive context. Furthermore, the parser aims to minimize disturbance to this representation, not only to ensure other system components can operate incrementally, but also to avoid unfortunate consequences for certain user-oriented services. The algorithm is augmented with a limited form of predictive tree-building, and a technique is presented for the determination of valid symbols for menu-based insertion. Copyright (C) 2001 John Wiley & Sons, Ltd.
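The following Python toy is not the paper's algorithm, but it illustrates the reuse pattern that incremental parsing relies on: analyses of unchanged regions (whole lines, in this simplification) are cached across edits, and only the edited region is re-analysed. The real parser applies the same idea to LR parse trees with error-tolerant subtree reuse.

# Greatly simplified sketch of incremental reuse: only the edited line is
# re-analysed; all other cached results are kept as-is across the edit.

def tokenize(line):                     # stand-in for the expensive analysis
    return line.split()

class IncrementalAnalyser:
    def __init__(self, lines):
        self.lines = list(lines)
        self.cache = [tokenize(l) for l in self.lines]

    def edit(self, lineno, new_text):
        self.lines[lineno] = new_text
        self.cache[lineno] = tokenize(new_text)   # only this line is redone

    def tokens(self):
        return [t for toks in self.cache for t in toks]

doc = IncrementalAnalyser(["int x ;", "x = 1 ;"])
doc.edit(1, "x = 2 ;")                  # line 0's analysis is reused as-is
print(doc.tokens())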

Relevance:

10.00%

Publisher:

Abstract:

Final Master's project for the award of the degree of Master in Mechanical Engineering

Relevance:

10.00%

Publisher:

Abstract:

Computerized scheduling methods and computerized scheduling systems according to exemplary embodiments. A computerized scheduling method may be stored in a memory and executed on one or more processors. The method may include defining a main multi-machine scheduling problem as a plurality of single machine scheduling problems; independently solving the plurality of single machine scheduling problems, thereby calculating a plurality of near-optimal single machine scheduling problem solutions; integrating the plurality of near-optimal single machine scheduling problem solutions into a main multi-machine scheduling problem solution; and outputting the main multi-machine scheduling problem solution.
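To make the decomposition concrete, here is a minimal Python sketch under stated assumptions: jobs are pre-assigned to machines, each single-machine problem is solved with the shortest-processing-time rule (a common near-optimal heuristic for mean flow time), and the per-machine solutions are merged into the overall schedule. The job format and the choice of heuristic are illustrative, not taken from the patent text.

def solve_single_machine(jobs):
    """Order the jobs on one machine by shortest processing time."""
    return sorted(jobs, key=lambda job: job["time"])

def solve_multi_machine(jobs):
    # Step 1: define the main problem as one sub-problem per machine.
    by_machine = {}
    for job in jobs:
        by_machine.setdefault(job["machine"], []).append(job)
    # Steps 2 and 3: solve each single-machine problem independently,
    # then integrate the solutions into the main solution.
    return {m: solve_single_machine(js) for m, js in by_machine.items()}

jobs = [
    {"id": "A", "machine": 1, "time": 5},
    {"id": "B", "machine": 1, "time": 2},
    {"id": "C", "machine": 2, "time": 4},
]
print(solve_multi_machine(jobs))   # machine 1 runs B before A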

Relevance:

10.00%

Publisher:

Abstract:

In this paper, a module for homograph disambiguation in Portuguese Text-to-Speech (TTS) is proposed. This module works with a part-of-speech (POS) parser, used to disambiguate homographs that belong to different parts of speech, and a semantic analyzer, used to disambiguate homographs that belong to the same part of speech. The proposed algorithms are meant to solve a significant part of homograph ambiguity in European Portuguese (EP) (106 homograph pairs so far). The system is ready to be integrated into a Letter-to-Sound (LTS) converter. The algorithms were trained and tested with different corpora, and the experimental results yielded an accuracy rate of 97.8%. This methodology is also valid for Brazilian Portuguese (BP), since 95 of the homograph pairs are exactly the same as in EP. A comparison with a probabilistic approach was also carried out and the results discussed.
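A toy Python sketch of the two-stage strategy: a POS tag selects between homograph readings that differ in part of speech, while same-POS pairs would fall through to the semantic analyzer (omitted here). The lexicon entry, the stand-in tagger and the simplified pronunciations are illustrative assumptions, not the paper's data.

# POS-driven homograph disambiguation for TTS, heavily simplified.
LEXICON = {
    # word -> {POS tag: simplified pronunciation}
    "jogo": {"NOUN": "'Zogu (closed o)",   # "o jogo" (the game)
             "VERB": "'ZOgu (open o)"},    # "eu jogo" (I play)
}

def pos_tag(tokens):
    """Stand-in for a real POS parser: a word after a determiner is a noun."""
    tags = []
    for i, tok in enumerate(tokens):
        if i > 0 and tokens[i - 1] in {"o", "a", "os", "as"}:
            tags.append("NOUN")
        else:
            tags.append("VERB" if tok in LEXICON else "OTHER")
    return tags

def pronounce(tokens):
    return [LEXICON.get(t, {}).get(tag, t)
            for t, tag in zip(tokens, pos_tag(tokens))]

print(pronounce(["eu", "jogo", "o", "jogo"]))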

Relevance:

10.00%

Publisher:

Abstract:

The fundamental objective of this project is to create a compiler generator based on bottom-up parsers. The basis for this parser is the Cosel language and the Com module, a compiler generator based on top-down parsers that is currently used in the lab exercises of the Compilers I course. The new generator, which takes a grammar as input, must check whether it is a bottom-up LALR(1) grammar and parse an input string of symbols using that grammar.
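As a sketch of one building block of the LALR(1) check (in Python rather than Cosel), the code below computes FIRST sets for an epsilon-free grammar by fixed-point iteration; a full generator would go on to build LR(0) items and LALR(1) lookaheads. The grammar is an illustrative toy.

GRAMMAR = {                     # nonterminal -> list of productions
    "E": [["E", "+", "T"], ["T"]],
    "T": [["(", "E", ")"], ["id"]],
}

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:              # iterate to a fixed point
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                head = prod[0]
                # a terminal contributes itself, a nonterminal its FIRST set
                add = first[head] if head in grammar else {head}
                if not add <= first[nt]:
                    first[nt] |= add
                    changed = True
    return first

print(first_sets(GRAMMAR))      # {'E': {'(', 'id'}, 'T': {'(', 'id'}}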

Relevance:

10.00%

Publisher:

Abstract:

The Final Year Project consists of two essentially different parts that share a common theme: HTML code validation. The first part focuses on the study of the validation process. It supplies a brief introduction to the evolution of HTML and XHTML, the new tags introduced in HTML5, and the most common errors found in today's websites. Existing HTML validation tools are analyzed and examined in detail in order to compare their features and evaluate their performance. Lastly, a comparison of the parsing process in the most common browsers in use today is provided. The second part of the project shifts the focus towards the development of an XHTML5 validation tool. The input is an XHTML5 file whose content may or may not comply with the W3C specification, and therefore may or may not be a valid XHTML5 document. The output of this tool is a fixed XHTML5 document and an error log returned in the form of an XML file, including information on the course of action taken to fix each error and its location.
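A minimal Python sketch of the validator's two outputs, error detection and an XML error log, under strong simplifying assumptions: only mismatched or unclosed tags are caught (real XHTML5 validation checks the W3C content models), and the log format is invented for the example.

from html.parser import HTMLParser
import xml.etree.ElementTree as ET

class TagChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack, self.errors = [], []

    def handle_starttag(self, tag, attrs):
        self.stack.append((tag, self.getpos()))

    def handle_endtag(self, tag):
        if tag in [t for t, _ in self.stack]:
            while self.stack[-1][0] != tag:     # flag tags left open inside
                t, pos = self.stack.pop()
                self.errors.append((t, pos, "unclosed tag"))
            self.stack.pop()
        else:
            self.errors.append((tag, self.getpos(), "unexpected close tag"))

checker = TagChecker()
checker.feed("<html><body><p>hi</div></body></html>")
checker.errors.extend((t, pos, "unclosed tag") for t, pos in checker.stack)

log = ET.Element("errorlog")
for tag, (line, col), msg in checker.errors:
    ET.SubElement(log, "error", tag=tag, line=str(line), col=str(col)).text = msg
print(ET.tostring(log, encoding="unicode"))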

Relevance:

10.00%

Publisher:

Abstract:

The objective of this project is to become familiar with Semantic Web technologies, to understand what an ontology is, and to learn to model one in a domain of our choosing; and to build a parser that connects to Wikipedia and/or DBpedia to populate that ontology, allowing the user to browse its concepts and study their relationships.
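A small Python sketch of the parser step under illustrative assumptions: it sends a SPARQL query to DBpedia's public endpoint and stores the results in a plain dict standing in for the ontology (a real implementation would populate an OWL/RDF model). It requires network access and the third-party requests library; the queried class and property are examples only.

import requests

ENDPOINT = "https://dbpedia.org/sparql"
QUERY = """
SELECT ?lang ?designer WHERE {
  ?lang a dbo:ProgrammingLanguage ; dbo:designer ?designer .
} LIMIT 5
"""

resp = requests.get(ENDPOINT, params={
    "query": QUERY, "format": "application/sparql-results+json"})
ontology = {}   # concept -> related concepts, standing in for a real model
for row in resp.json()["results"]["bindings"]:
    ontology.setdefault(row["lang"]["value"], []).append(row["designer"]["value"])
print(ontology)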

Relevance:

10.00%

Publisher:

Abstract:

This work studied the parsing of files conforming to the IFC (Industry Foundation Classes) data model, the further processing of the parsed data, and data transfer between applications. The available options for implementing data transfer programmatically, and the direction data transfer is taking in the future, were investigated. In the applied part, parsing and interpretation of an IFC-standard ISO 10303 file (Part 21) into XML form was implemented. The application parses and interprets an IFC file produced with CAD software, using the C# programming language, and stores the data in an XML database to be read by cost-estimation software.
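A hedged Python sketch of the applied part (the thesis used C# and an XML database): it parses ISO 10303-21 data lines of the form #id=TYPE(args); with a regular expression and emits XML. Full STEP syntax (nested aggregates, records spanning lines) is deliberately ignored, and the sample data is invented.

import re
import xml.etree.ElementTree as ET

STEP_DATA = """#1=IFCWALL('2O2Fr$t4X7Zf8NOew3FLOH',#2,'Wall-001');
#2=IFCOWNERHISTORY(#3,#4);"""

LINE = re.compile(r"#(\d+)=([A-Z0-9]+)\((.*)\);")

root = ET.Element("ifc")
for match in LINE.finditer(STEP_DATA):
    entity_id, entity_type, args = match.groups()
    el = ET.SubElement(root, entity_type, id=entity_id)
    el.text = args                 # raw argument list; not further decoded
print(ET.tostring(root, encoding="unicode"))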

Relevance:

10.00%

Publisher:

Abstract:

Biomedical research is currently facing a new type of challenge: an excess of information, both in terms of raw data from experiments and in the number of scientific publications describing their results. Mirroring the focus on data mining techniques to address the issues of structured data, there has recently been great interest in the development and application of text mining techniques to make more effective use of the knowledge contained in biomedical scientific publications, accessible only in the form of natural human language. This thesis describes research done in the broader scope of projects aiming to develop methods, tools and techniques for text mining tasks in general and for the biomedical domain in particular. The work described here involves more specifically the goal of extracting information from statements concerning relations of biomedical entities, such as protein-protein interactions. The approach taken uses full parsing (syntactic analysis of the entire structure of sentences) and machine learning, aiming to develop reliable methods that can further be generalized to apply also to other domains.

The five papers at the core of this thesis describe research on a number of distinct but related topics in text mining. In the first of these studies, we assessed the applicability of two popular general English parsers to biomedical text mining and, finding their performance limited, identified several specific challenges to accurate parsing of domain text. In a follow-up study focusing on parsing issues related to specialized domain terminology, we evaluated three lexical adaptation methods. We found that the accurate resolution of unknown words can considerably improve parsing performance, and introduced a domain-adapted parser that reduced the error rate of the original by 10% while also roughly halving parsing time. To establish the relative merits of parsers that differ in the applied formalisms and in the representation given to their syntactic analyses, we also developed evaluation methodology, considering different approaches to establishing comparable dependency-based evaluation results. We introduced a methodology for creating highly accurate conversions between different parse representations, demonstrating the feasibility of unifying diverse syntactic schemes under a shared, application-oriented representation. In addition to allowing formalism-neutral evaluation, we argue that such unification can also increase the value of parsers for domain text mining. As a further step in this direction, we analysed the characteristics of publicly available biomedical corpora annotated for protein-protein interactions and created tools for converting them into a shared form, thus contributing also to the unification of text mining resources. The introduced unified corpora allowed us to perform a task-oriented comparative evaluation of biomedical text mining corpora. This evaluation established clear limits on the comparability of results for text mining methods evaluated on different resources, prompting further efforts toward standardization.

To support this and other research, we also designed and annotated BioInfer, the first domain corpus of its size combining annotation of syntax and biomedical entities with a detailed annotation of their relationships. The corpus represents a major design and development effort of the research group, with manual annotation that identifies over 6,000 entities, 2,500 relationships and 28,000 syntactic dependencies in 1,100 sentences. In addition to combining these key annotations for a single set of sentences, BioInfer was also the first domain resource to introduce a representation of entity relations that is supported by ontologies and able to capture complex, structured relationships. Part I of this thesis presents a summary of this research in the broader context of a text mining system, and Part II contains reprints of the five included publications.
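As a hedged sketch of one ingredient of such conversions, the Python snippet below rewrites the dependency labels of one scheme into a shared target scheme via a mapping table. The thesis methodology covers full structural conversions between representations; the labels and the mapping here are illustrative, not the actual schemes.

LABEL_MAP = {               # source-scheme label -> shared-scheme label
    "SBJ": "nsubj",
    "OBJ": "dobj",
    "NMOD": "nmod",
}

def convert(dependencies):
    """Each dependency is a (head, label, dependent) triple."""
    return [(h, LABEL_MAP.get(label, label), d) for h, label, d in dependencies]

parse = [("binds", "SBJ", "protein"), ("binds", "OBJ", "receptor")]
print(convert(parse))   # [('binds', 'nsubj', 'protein'), ('binds', 'dobj', 'receptor')]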

Relevance:

10.00%

Publisher:

Abstract:

Dynamic logic is an extension of modal logic originally intended for reasoning about computer programs. The method of proving correctness of properties of a computer program using the well-known Hoare logic can be implemented by utilizing the robustness of dynamic logic. For a very broad range of languages and applications in program verification, a theorem prover named KIV (Karlsruhe Interactive Verifier) has already been developed. But its high degree of automation and its complexity make it difficult to use for educational purposes. My research work is directed towards the design and implementation of a similar interactive theorem prover with educational use as its main design criterion. As the key purpose of this system is to serve as an educational tool, it is a self-explanatory system that explains every step of creating a derivation, i.e., proving a theorem. This deductive system is implemented in the platform-independent programming language Java. In addition, a very popular combination of a lexical analyzer generator, JFlex, and the parser generator BYacc/J has been used for parsing formulas and programs.
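For flavour, here is a compact Python stand-in for the lexer-parser pair that the system builds with JFlex and BYacc/J in Java: a hand-written recursive-descent parser for a small fragment of dynamic logic with atoms, implication and the box modality [program]formula. The grammar fragment is an illustrative assumption.

import re

TOKEN = re.compile(r"\s*(->|\[|\]|\w+)")

def tokenize(s):
    return TOKEN.findall(s)

def parse(tokens):
    """formula := unary ('->' formula)? ; unary := '[' prog ']' unary | atom"""
    lhs, rest = parse_unary(tokens)
    if rest and rest[0] == "->":
        rhs, rest = parse(rest[1:])      # '->' is right-associative
        return ("->", lhs, rhs), rest
    return lhs, rest

def parse_unary(tokens):
    if tokens[0] == "[":
        assert tokens[2] == "]", "expected ']' after the program"
        body, rest = parse_unary(tokens[3:])
        return ("box", tokens[1], body), rest
    return ("atom", tokens[0]), tokens[1:]

tree, rest = parse(tokenize("[alpha]p -> q"))
print(tree)   # ('->', ('box', 'alpha', ('atom', 'p')), ('atom', 'q'))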

Relevance:

10.00%

Publisher:

Abstract:

Classical relational databases lack proper ways to manage certain real-world situations involving imprecise or uncertain data. Fuzzy databases overcome this limitation by allowing each entry in a table to be a fuzzy set, where each element of the corresponding domain is assigned a membership degree from the real interval [0, 1]. But this fuzzy mechanism becomes inappropriate in modelling scenarios where data might be incomparable. Therefore, we become interested in a further generalization of fuzzy databases into L-fuzzy databases. In such a database, the characteristic function of a fuzzy set maps to an arbitrary complete Brouwerian lattice L. From the query language perspective, the language of fuzzy databases, FSQL, extends the regular Structured Query Language (SQL) by adding fuzzy-specific constructions. In addition, the L-fuzzy query language LFSQL introduces appropriate linguistic operations to define and manipulate inexact data in an L-fuzzy database. This research mainly focuses on defining the semantics of LFSQL. Doing so requires an abstract algebraic theory that can be used to prove all the properties of, and operations on, L-fuzzy relations. In our study, we show that the theory of arrow categories forms a suitable framework for that purpose. Therefore, we define the semantics of LFSQL in the abstract notion of an arrow category. In addition, we implement the operations of L-fuzzy relations in Haskell and develop a parser that translates algebraic expressions into our implementation.
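An illustrative Python sketch of L-fuzzy relations (the thesis itself works in Haskell within the arrow-category framework): degrees are drawn from a toy diamond lattice whose elements a and b are incomparable, exactly the situation [0, 1]-valued fuzziness cannot model, and relational composition is join-over-meets, generalizing max-min composition.

ORDER = {"0": {"0", "a", "b", "1"}, "a": {"a", "1"}, "b": {"b", "1"}, "1": {"1"}}

def leq(x, y):
    return y in ORDER[x]

def meet(x, y):
    if leq(x, y): return x
    if leq(y, x): return y
    return "0"                  # incomparable pair: greatest lower bound is 0

def join(x, y):
    if leq(x, y): return y
    if leq(y, x): return x
    return "1"                  # incomparable pair: least upper bound is 1

def compose(R, S):
    """(R;S)(x,z) = join over all y of meet(R(x,y), S(y,z))."""
    out = {}
    for (x, y), d1 in R.items():
        for (y2, z), d2 in S.items():
            if y == y2:
                out[(x, z)] = join(out.get((x, z), "0"), meet(d1, d2))
    return out

R = {("alice", "dept1"): "a"}
S = {("dept1", "projX"): "b"}
print(compose(R, S))   # {('alice', 'projX'): '0'} since meet(a, b) = 0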

Relevance:

10.00%

Publisher:

Abstract:

Lattice-valued fuzziness is more general than crispness or fuzziness based on the unit interval. In this work, we present a query language for a lattice-based fuzzy database. We define a Lattice Fuzzy Structured Query Language (LFSQL) taking its membership values from an arbitrary lattice L. LFSQL can handle, manage and represent crisp values and linearly ordered membership degrees, and also allows membership degrees from lattices with non-comparable values. This gives richer membership degrees and hence makes LFSQL more flexible than FSQL or SQL. In order to handle vagueness or imprecise information, every entry in an L-fuzzy database is an L-fuzzy set instead of a crisp value. All of this makes LFSQL an ideal query language for handling imprecise data where some factors are non-comparable. After defining the syntax of the language formally, we provide its semantics using L-fuzzy sets and relations. The semantics can be used in future work to investigate concepts such as functional dependencies. Last but not least, we present a parser for LFSQL implemented in Haskell.
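A hedged Python sketch of how an LFSQL-style SELECT can be given semantics: every predicate returns a membership degree instead of a boolean, conjunction is interpreted as lattice meet, and each returned row carries its degree. The chain lattice and the linguistic predicate are invented for the example; the actual semantics is defined over L-fuzzy relations.

DEGREES = ["0", "low", "high", "1"]          # a small chain lattice

def meet(x, y):
    return min(x, y, key=DEGREES.index)

def select(table, *predicates):
    """Return each row paired with the meet of its predicate degrees."""
    result = []
    for row in table:
        degree = "1"
        for pred in predicates:
            degree = meet(degree, pred(row))
        if degree != "0":
            result.append((row, degree))
    return result

def tall(row):      # linguistic predicate returning a degree, not a boolean
    return "1" if row["cm"] >= 190 else "high" if row["cm"] >= 180 else "0"

people = [{"name": "ana", "cm": 184}, {"name": "bo", "cm": 170}]
print(select(people, tall))   # [({'name': 'ana', 'cm': 184}, 'high')]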

Relevance:

10.00%

Publisher:

Abstract:

This work is aimed at building an adaptable frame-based system for processing Dravidian languages. There are about 17 languages in this family, spoken by the people of South India. Karaka relations are one of the most important features of Indian languages: they are the semantico-syntactic relations between verbs and the other related constituents in a sentence. The karaka relations and surface case endings are analyzed for meaning extraction. This approach is comparable with the broad class of case-based grammars. The efficiency of the approach is put to the test in two applications: one is machine translation and the other is a natural language interface (NLI) for information retrieval from databases. The system mainly consists of a morphological analyzer, a local word grouper, a parser for the source language and a sentence generator for the target language. Among the contributions of this work is an elegant and compact account of the mapping between vibhakthi (case ending) and karaka roles in Dravidian languages. The same basic mapping also explains simple and complex sentences in these languages, which suggests that the solution is not just ad hoc but has a deeper underlying unity. The methodology could be extended to other free word order languages. Since the frames designed for meaning representation are general, they are adaptable to other languages in this group and to other applications.
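A toy Python sketch of the vibhakthi-to-karaka mapping described above: surface case endings on word groups select karaka roles in the verb's frame. The endings, roles and the glossed example are simplified illustrations, not an actual analysis of any Dravidian language.

VIBHAKTHI_TO_KARAKA = {      # case ending -> karaka role
    "":    "karta",          # nominative -> agent
    "-e":  "karma",          # accusative -> object
    "-il": "adhikarana",     # locative   -> location
}

def fill_frame(word_groups):
    """word_groups: (stem, case_ending) pairs from the local word grouper."""
    frame = {}
    for stem, ending in word_groups:
        role = VIBHAKTHI_TO_KARAKA.get(ending)
        if role:
            frame[role] = stem
    return frame

# gloss: "child saw the ball in the garden", after morphological analysis
print(fill_frame([("kutti", ""), ("panth", "-e"), ("thottam", "-il")]))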

Relevance:

10.00%

Publisher:

Abstract:

This thesis summarizes the results of studies on a syntax-based approach to translation between Malayalam, one of the Dravidian languages, and English, and on the development of the major modules in building a prototype machine translation system from Malayalam to English. The development of the system is a pioneering effort for the Malayalam language, unattempted by previous researchers, and the computational models chosen for the system are the first of their kind for Malayalam. An in-depth study has been carried out on the design of the computational models and data structures needed for the different modules required for the prototype system: a morphological analyzer, a parser, a syntactic structure transfer module and a target language sentence generator. The generation of the list of part-of-speech tags, chunk tags and the hierarchical dependencies among chunks required for the translation process has also been completed. In the development process, the major goals are (a) accuracy of translation, (b) speed and (c) space. Accuracy-wise, smart tools for handling transfer grammar and translation standards, including equivalent words, expressions, phrases and styles in the target language, are to be developed; the grammar should be optimized with a view to obtaining a single correct parse and hence a single translated output. Speed-wise, innovative use of corpus analysis, an efficient parsing algorithm, the design of efficient data structures and run-time frequency-based rearrangement of the grammar, which substantially reduces parsing and generation time, are required. The space requirement also has to be minimised.
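As a hedged sketch of the syntactic structure transfer module, the Python snippet below reorders parsed chunks from Malayalam's SOV order to English SVO order before generation. The chunk labels and the glossed example are illustrative assumptions about the prototype's internal representation.

def transfer_sov_to_svo(chunks):
    """chunks: (label, words) pairs produced by the source-language parser."""
    order = {"S": 0, "V": 1, "O": 2}                 # target SVO order
    return sorted(chunks, key=lambda c: order.get(c[0], 3))

# Malayalam gloss: "child(S) ball(O) saw(V)"
parsed = [("S", "the child"), ("O", "the ball"), ("V", "saw")]
print(" ".join(words for _, words in transfer_sov_to_svo(parsed)))
# -> "the child saw the ball"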