932 results for Lexical Database
Abstract:
In this paper we show how information technologies, used as tools for creating digital bilingual dictionaries, can help preserve natural languages. Natural languages are an outstanding part of human cultural values, and for that reason they should be preserved as part of the world's cultural heritage. We describe our work on the bilingual lexical database supporting the Bulgarian-Polish Online dictionary. The main software tools for the web presentation of the dictionary are briefly described. We pay special attention to the presentation of verbs, the linguistic category in Bulgarian that is richest in specific characteristics.
Abstract:
The paper aims to present a bilingual online dictionary as a useful tool that helps preserve natural languages. The author focuses on the approach taken to develop a compatible bilingual lexical database for the Bulgarian-Polish online dictionary. A formal model for the dictionary encoding is developed in accordance with the complex structures of the dictionary entries. These structures vary depending on the grammatical characteristics of Bulgarian headwords. The Web application for presentation of the bilingual dictionary is also described.
Abstract:
The aim of this article is to present the structure of the relational database that contains all the syntactic information included in the Diccionario Crítico Etimológico Castellano e Hispánico by J. Corominas and J. A. Pascual. Although this dictionary contains a wide range of historical information on each of its topics, this information is not presented in a structured form, so it was necessary to study and classify all the elements related to syntactic aspects. The fields of the database were drawn up on the basis of this preliminary study and are grouped into five thematic blocks: lemma information; grammatical information; syntactic information; other related aspects; and relevant observations or comments made by the researcher. The database not only reproduces the contents of the dictionary but also includes several interpretive fields. For this reason, Syntax.dbf is a fundamental working tool for all researchers interested in the diachronic syntax of Spanish.
Abstract:
Greek speakers say "ουρά", Germans "schwanz" and the French "queue" to describe what English speakers call a 'tail', but all of these languages use a related form of 'two' to describe the number after one. Among more than 100 Indo-European languages and dialects, the words for some meanings (such as 'tail') evolve rapidly, being expressed across languages by dozens of unrelated words, while others evolve much more slowly, such as the number 'two', for which all Indo-European language speakers use the same related word-form(1). No general linguistic mechanism has been advanced to explain this striking variation in rates of lexical replacement among meanings. Here we use four large and divergent language corpora (English(2), Spanish(3), Russian(4) and Greek(5)) and a comparative database of 200 fundamental vocabulary meanings in 87 Indo-European languages(6) to show that the frequency with which these words are used in modern language predicts their rate of replacement over thousands of years of Indo-European language evolution. Across all 200 meanings, frequently used words evolve at slower rates and infrequently used words evolve more rapidly. This relationship holds separately and identically across parts of speech for each of the four language corpora, and accounts for approximately 50% of the variation in historical rates of lexical replacement. We propose that the frequency with which specific words are used in everyday language exerts a general and law-like influence on their rates of evolution. Our findings are consistent with social models of word change that emphasize the role of selection, and suggest that owing to the ways that humans use language, some words will evolve slowly and others rapidly across all languages.
Abstract:
An implementation of a Lexical Functional Grammar (LFG) natural language front-end to a database is presented, and its capabilities demonstrated by reference to a set of queries used in the Chat-80 system. The potential of LFG for such applications is explored. Other grammars previously used for this purpose are briefly reviewed and contrasted with LFG. The basic LFG formalism is fully described, both as to its syntax and semantics, and the deficiencies of the latter for database access application shown. Other current LFG implementations are reviewed and contrasted with the LFG implementation developed here specifically for database access. The implementation described here allows a natural language interface to a specific Prolog database to be produced from a set of grammar rule and lexical specifications in an LFG-like notation. In addition to this the interface system uses a simple database description to compile metadata about the database for later use in planning the execution of queries. Extensions to LFG's semantic component are shown to be necessary to produce a satisfactory functional analysis and semantic output for querying a database. A diverse set of natural language constructs are analysed using LFG and the derivation of Prolog queries from the F-structure output of LFG is illustrated. The functional description produced from LFG is proposed as sufficient for resolving many problems of quantification and attachment.
Abstract:
Different types of water bodies, including lakes, streams, and coastal marine waters, are often susceptible to fecal contamination from a range of point and nonpoint sources, and have been evaluated using fecal indicator microorganisms. The most commonly used fecal indicator is Escherichia coli, but traditional cultivation methods do not allow discrimination of the source of pollution. The use of triplex PCR offers an approach that is fast and inexpensive, and here enabled the identification of phylogroups. The phylogenetic distribution of E. coli subgroups isolated from water samples revealed higher frequencies of subgroups A1 and B23 in rivers impacted by human pollution sources, while subgroups D1 and D2 were associated with pristine sites, and subgroup B1 with domesticated animal sources, suggesting their use as a first screening for pollution source identification. A simple classification is also proposed based on phylogenetic subgroup distribution using the w-clique metric, enabling differentiation of polluted and unpolluted sites.
Abstract:
Despite a strong increase in research on the ecology and biogeography of seamounts and oceanic islands, many basic aspects of their biodiversity are still unknown. In the southwestern Atlantic, the Vitória-Trindade Seamount Chain (VTC) extends ca. 1,200 km off the Brazilian continental shelf, from the Vitória seamount to the oceanic islands of Trindade and Martin Vaz. For a long time, most of the available biological information concerned its islands. Our study presents and analyzes an extensive database on the VTC fish biodiversity, built on data compiled from the literature and from recent scientific expeditions that assessed both shallow and mesophotic environments. A total of 273 species were recorded, 211 of which occur on seamounts and 173 at the islands. New records for seamounts or islands include 191 reef fish species and 64 depth range extensions. The structure of fish assemblages was similar between islands and seamounts, not differing in species geographic distribution, trophic composition, or spawning strategies. The main differences were related to endemism, which was higher at the islands, and to the number of endangered species, which was higher at the seamounts. Since unregulated fishing activities are common in the region, and mining activities are expected to increase drastically in the near future (carbonates on seamount summits and metals on slopes), this unique biodiversity needs urgent attention and management.
Abstract:
This article presents a characterization of the lexical competence (vocabulary knowledge and use) of students learning to read in EFL at a public university in São Paulo state. Although vocabulary has been consistently cited as one of the EFL reader's main sources of difficulty, no data in the literature show the extent of the difficulties. The data for this study come from an earlier research project that investigated, from the perspective of an interactive model of reading, the relationship between lexical competence and EFL reading comprehension. Quantitative as well as qualitative data were considered. For this study, the quantitative data are the product of vocabulary tests of 49 subjects, while the qualitative data comprise pause protocols of three subjects, with levels of reading ability ranging from good to poor, selected on the basis of their performance in the quantitative study. A rich concept of vocabulary knowledge was adapted and used for the development of vocabulary tests and the analysis of protocols. The results of both studies show, with a few exceptions, the lexical competence of the group to be vague and imprecise in two dimensions: quantitative (number of known words, or vocabulary size) and qualitative (depth or width of this knowledge). Implications for the teaching of reading in a foreign-language context are discussed.
Abstract:
Considering the difficulties in finding good-quality images for the development and testing of computer-aided diagnosis (CAD), this paper presents a public online database of mammographic images that is free for all interested viewers and is intended to help develop and evaluate CAD schemes. The mammographic images are digitized with contrast and spatial resolution suitable for processing purposes. The broad retrieval system allows the user to search by different image, exam, or patient characteristics. Comparison with other currently available databases has shown that the presented database has a sufficient number of images, is of high quality, and is the only one to include a functional search system.
Abstract:
This article documents the addition of 229 microsatellite marker loci to the Molecular Ecology Resources Database. Loci were developed for the following species: Acacia auriculiformis x Acacia mangium hybrid, Alabama argillacea, Anoplopoma fimbria, Aplochiton zebra, Brevicoryne brassicae, Bruguiera gymnorhiza, Bucorvus leadbeateri, Delphacodes detecta, Tumidagena minuta, Dictyostelium giganteum, Echinogammarus berilloni, Epimedium sagittatum, Fraxinus excelsior, Labeo chrysophekadion, Oncorhynchus clarki lewisi, Paratrechina longicornis, Phaeocystis antarctica, Pinus roxburghii and Potamilus capax. These loci were cross-tested on the following species: Acacia peregrinalis, Acacia crassicarpa, Bruguiera cylindrica, Delphacodes detecta, Tumidagena minuta, Dictyostelium macrocephalum, Dictyostelium discoideum, Dictyostelium purpureum, Dictyostelium mucoroides, Dictyostelium rosarium, Polysphondylium pallidum, Epimedium brevicornum, Epimedium koreanum, Epimedium pubescens, Epimedium wushanese and Fraxinus angustifolia.
Abstract:
Much information on the flavonoid content of Brazilian foods has already been obtained; however, this information is scattered across scientific publications and unpublished data. The objectives of this work were to compile national flavonoid data and evaluate their quality according to the United States Department of Agriculture's Data Quality Evaluation System (USDA-DQES), with a few modifications, for future dissemination in the TBCA-USP (Brazilian Food Composition Database). For the compilation, the most abundant compounds in the flavonoid subclasses were considered (flavonols, flavones, isoflavones, flavanones, flavan-3-ols, and anthocyanidins), and analysis of the compounds by HPLC was adopted as the criterion for data inclusion. The evaluation system considers five categories, and the maximum score assigned to each category is 20. Each data point was assigned a confidence code (CC) of A, B, C or D, indicating the quality and reliability of the information. A total of 773 flavonoid data points from 197 Brazilian foods were evaluated. The CC "C" (average) was attributed to 99% of the data and "B" (above average) to 1%. The main categories that received low average scores were number of samples, sampling plan, and analytical quality control (average scores 2, 5 and 4, respectively). The analytical method category received an average score of 9. The category assigned the highest score was sample handling (average of 20). These results show that researchers need to be aware of the importance of the number of samples and the sampling plan, and of the complete description and documentation of all methodological procedures and analytical quality control. (C) 2010 Elsevier Inc. All rights reserved.
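The scoring scheme described in this abstract can be sketched as follows. The abstract states only that there are five categories, each scored up to 20, and four confidence codes. The cut-offs used here (A for 75 or more, B for 50 to 74, C for 25 to 49, D below 25 on the summed score) are an assumption based on how the USDA system is usually described and may differ from the modified version the authors applied. The category names come from the abstract; the function name is hypothetical.

```python
# Category names taken from the abstract; each is scored from 0 to 20.
CATEGORIES = (
    "sampling_plan",
    "number_of_samples",
    "sample_handling",
    "analytical_method",
    "analytical_quality_control",
)


def confidence_code(scores):
    """Sum the five category scores and map the total to a confidence code.

    The thresholds below are ASSUMED (not given in the abstract).
    """
    if set(scores) != set(CATEGORIES):
        raise ValueError("scores must cover exactly the five categories")
    for name, value in scores.items():
        if not 0 <= value <= 20:
            raise ValueError(f"{name} must be between 0 and 20")
    total = sum(scores.values())
    if total >= 75:
        code = "A"
    elif total >= 50:
        code = "B"
    elif total >= 25:
        code = "C"
    else:
        code = "D"
    return total, code


# Average category scores reported in the abstract, used only as an illustration.
reported_averages = {
    "number_of_samples": 2,
    "sampling_plan": 5,
    "analytical_quality_control": 4,
    "analytical_method": 9,
    "sample_handling": 20,
}
print(confidence_code(reported_averages))  # (40, 'C')
```

Under these assumed cut-offs, the reported averages sum to 40, which falls in the "C" band and is consistent with most data receiving that code. In practice each data point would be scored individually, not from category averages.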
Abstract:
Foods that contain unavailable carbohydrates may lower the risks for some non-transmissible chronic diseases because of the potential benefits provided by the products of colonic fermentation. On the other hand, foods that are sources of available carbohydrates may have higher energy value and increase the post-prandial glycemic response. The biomarker glycemic index and the resulting glycemic load may be used to classify foods according to their potential to increase blood glucose. Information about glycemic index and glycemic load may be useful in diet therapy. Currently, food composition tables in Brazil do not provide data for individually analyzed carbohydrates even though some quality data are available in scientific publications. The objectives of this work were to produce and compile information about the concentration of individual carbohydrates in foods and their glycemic responses and to disseminate this information through the Brazilian Food Composition Database (TBCA-USP). The glycemic index and glycemic load of foods were evaluated in healthy individuals. Concentrations of available carbohydrates (soluble sugars and available starch) and unavailable carbohydrates (dietary fiber, resistant starch, beta-glucans, fructans) were quantified by official methods, and other national data were compiled. TBCA-USP (http://www.fcf.usp.br/tabela), which is used by professionals and the population in general, now offers both chemical and biological information for carbohydrates. (C) 2009 Elsevier Inc. All rights reserved.
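The relation between glycemic index and glycemic load mentioned in this abstract can be illustrated with a short sketch. It assumes the commonly used definition, GL = GI × available carbohydrate (g per serving) / 100, which the abstract itself does not state. The function name and the example values are hypothetical and are not taken from TBCA-USP.

```python
def glycemic_load(glycemic_index: float, available_carbs_g: float) -> float:
    """Return the glycemic load of one serving.

    Assumes the common definition GL = GI * available carbohydrate (g) / 100,
    with GI expressed on the glucose = 100 scale.
    """
    if glycemic_index < 0 or available_carbs_g < 0:
        raise ValueError("inputs must be non-negative")
    return glycemic_index * available_carbs_g / 100.0


# Hypothetical food: GI of 50 and 30 g of available carbohydrate per serving.
print(glycemic_load(50, 30))  # 15.0
```

Only available carbohydrates enter the calculation, which is one reason the distinction between available and unavailable carbohydrates matters for food composition tables.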
Abstract:
Nine individuals with complex language deficits following left-hemisphere cortical lesions and a matched control group (n = 9) performed speeded lexical decisions on the third word of auditory word triplets containing a lexical ambiguity. The critical conditions were concordant (e.g., coin–bank–money), discordant (e.g., river–bank–money), neutral (e.g., day–bank–money), and unrelated (e.g., river–day–money). Triplets were presented with interstimulus intervals (ISIs) of 100 and 1250 ms. Overall, the left-hemisphere-damaged subjects appeared able to exhaustively access meanings for lexical ambiguities rapidly, but, unlike control subjects, were unable to reduce the level of activation for contextually inappropriate meanings at both short and long ISIs. These findings are consistent with a disruption of the proposed role of the left hemisphere in selecting and suppressing meanings via contextual integration and a sparing of the right-hemisphere mechanisms responsible for maintaining alternative meanings.