950 results for Lexical units
Abstract:
-panels- provides a quick way to count the number of panels (groups) in a dataset and display some basic information about the sizes of the panels. Furthermore, -panels- can be used as a prefix command to other Stata commands to apply them to panel units instead of individual observations. This is useful, for example, if you want to compute frequency distributions or summary statistics for panel characteristics.
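As a rough illustration of the counting step described above, the following Python sketch tallies panels and their sizes from a list of panel identifiers. It is not the Stata implementation of -panels-; the function name `panel_summary` and the sample identifiers are assumptions made for this example.

```python
from collections import Counter
from statistics import mean

def panel_summary(panel_ids):
    """Count panels and summarise their sizes (observations per panel)."""
    sizes = Counter(panel_ids)          # panel id -> number of observations
    counts = list(sizes.values())
    return {
        "n_panels": len(sizes),
        "n_obs": len(panel_ids),
        "min_size": min(counts),
        "mean_size": mean(counts),
        "max_size": max(counts),
    }

# Hypothetical long-format data: one entry per observation, keyed by panel id.
ids = ["a", "a", "a", "b", "b", "c"]
summary = panel_summary(ids)
```

Working on one record per panel rather than one per observation, as the prefix usage does, would correspond here to operating on the `sizes` mapping instead of the raw list.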
Abstract:
The high integration density of current nanometer technologies allows the implementation of complex floating-point applications in a single FPGA. In this work the intrinsic complexity of floating-point operators is addressed targeting configurable devices and making design decisions that provide the most suitable trade-offs between performance and standard compliance. A set of floating-point libraries composed of adder/subtracter, multiplier, divider, square root, exponential, logarithm and power function is presented. Each library has been designed taking into account the special characteristics of current FPGAs, and to this end we have adapted the IEEE floating-point standard (software-oriented) to a custom FPGA-oriented format. Extensive experimental results validate the design decisions made and prove the usefulness of reducing the format complexity.
Abstract:
OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web
1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS
Computational Linguistics is already a consolidated research area. It builds upon the results of two other major areas, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly, its best-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs.
These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.
Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.
However, linguistic annotation tools still have some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.
A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitations stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.
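The idea of combining annotations produced for a common level can be sketched in Python as below. Majority voting is only one possible combination strategy and is an assumption of this example, not necessarily the method adopted in this work; the function name `combine_tags` and the sample tag sequences are likewise hypothetical. The sketch also assumes that the tools' outputs have already been mapped to a shared tagset, which is precisely what limitation (3) makes difficult.

```python
from collections import Counter

def combine_tags(annotations):
    """Combine per-token tags from several taggers by majority vote.

    annotations: list of tag sequences, one per tool, all over the same
    tokens and already expressed in a shared tagset.
    Returns one tag per token; ties fall back to the first tool's tag.
    """
    combined = []
    for token_tags in zip(*annotations):
        counts = Counter(token_tags)
        best, freq = counts.most_common(1)[0]
        # A tie means there is no real majority: keep the first tool's choice.
        tied = [t for t, c in counts.items() if c == freq]
        combined.append(best if len(tied) == 1 else token_tags[0])
    return combined

# Hypothetical outputs of three POS taggers over the same three tokens.
tagger_a = ["DET", "NOUN", "VERB"]
tagger_b = ["DET", "VERB", "VERB"]
tagger_c = ["DET", "NOUN", "NOUN"]
result = combine_tags([tagger_a, tagger_b, tagger_c])
```

Each tagger makes a different isolated error here, and the vote overrules both of them, which is the effect the paragraph above hopes to obtain from interoperation.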
In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by another higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.
Therefore, it would be quite useful to find a way to
(i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools;
(ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. There again, ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should also help solve the aforementioned problems and limitations of linguistic annotation tools.
Thus, to summarise, the main aim of the present work was to combine these separate approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.
2. GOALS OF THE PRESENT WORK
As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based
Abstract:
Lexica and terminology databases play a vital role in many NLP applications, but currently most such resources are published in application-specific formats, or with custom access interfaces, leading to the problem that much of this data is in "data silos" and hence difficult to access. The Semantic Web and in particular the Linked Data initiative provide effective solutions to this problem, as well as possibilities for data reuse by inter-lexicon linking, and incorporation of data categories by dereferenceable URIs. The Semantic Web focuses on the use of ontologies to describe semantics on the Web, but currently there is no standard for providing complex lexical information for such ontologies and for describing the relationship between the lexicon and the ontology. We present our model, lemon, which aims to address these gaps.
Abstract:
A new method to analyze the influence of possible hysteresis cycles in devices employed for optical computing architectures is reported. A simple full adder structure is taken as the basis for this method. Single units, called optical programmable logic cells, previously reported by the authors, compose this structure. These cells employ, as basic devices, on-off and SEED-like components. Their hysteresis cycles have been modeled by numerical analysis. The influence of the different characteristic cycles is studied with respect to the possible errors obtained at the output. Two different approaches have been adopted. The first one shows the change in the arithmetic result output with respect to the different values and positions of the hysteresis cycle. The second one offers a similar result, but in a polar diagram where the total behavior of the system is better analyzed.
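To make the notion of a hysteresis cycle in an on-off device concrete, the following Python sketch models a generic two-threshold switch. It is an idealised illustration, not the authors' numerical model of the optical cells; the class name `HystereticSwitch` and the threshold values are assumptions chosen for the example.

```python
class HystereticSwitch:
    """On-off device whose output depends on input history (two thresholds)."""

    def __init__(self, low, high, state=0):
        if low > high:
            raise ValueError("low threshold must not exceed high threshold")
        self.low, self.high, self.state = low, high, state

    def step(self, x):
        # Switch on only above `high`, off only below `low`;
        # inside the loop the previous state is kept.
        if x > self.high:
            self.state = 1
        elif x < self.low:
            self.state = 0
        return self.state

def sweep(device, inputs):
    """Feed a sequence of input levels and collect the outputs."""
    return [device.step(x) for x in inputs]

# Ramp the input up and back down (hypothetical normalised levels).
up_then_down = [0.0, 0.5, 1.0, 0.5, 0.0]
outputs = sweep(HystereticSwitch(low=0.3, high=0.7), up_then_down)
```

The same input level 0.5 yields output 0 on the way up and 1 on the way down. Shifting or widening the interval between `low` and `high` changes which inputs are read as logic 1, which is how the position and width of the cycle can translate into arithmetic errors at an adder output.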
Abstract:
In some previous papers we proposed an optical communications system based on a digital chaotic signal, where the synchronization of chaos was the main objective. In this paper we extend this work. The main objective is a way to add the digital data signal to be transmitted onto the chaotic signal and to receive it correctly. We report some methods to study the main characteristics of the resulting signal. The main problem with any real system is the presence of some delay between the time at which the signal is generated at the emitter and the time at which it is received. Any system using chaotic signals as a method of encryption needs to have the same characteristics in emitter and receiver. For this reason, this control of time is needed. A method to control the chaotic signals in real time is reported.
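The need for identical characteristics and timing at both ends can be illustrated with a small Python sketch. The abstract does not specify how the data is added to the chaotic signal, so this example substitutes a simple stand-in: bits obtained by thresholding a logistic map are XORed with the data. The map, its parameters, and the function names `chaotic_bits` and `mix` are all assumptions of the example.

```python
def chaotic_bits(x0, r, n, skip=0):
    """Generate n bits by thresholding a logistic map x -> r*x*(1-x)."""
    x = x0
    for _ in range(skip):          # discard samples to model a time offset
        x = r * x * (1.0 - x)
    bits = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        bits.append(1 if x > 0.5 else 0)
    return bits

def mix(data, carrier):
    """XOR data bits with carrier bits; applying it twice restores the data."""
    return [d ^ c for d, c in zip(data, carrier)]

message = [1, 0, 1, 1, 0, 0, 1, 0]
sent = mix(message, chaotic_bits(0.3141, 3.99, len(message)))
# A receiver with identical parameters and timing recovers the message.
received = mix(sent, chaotic_bits(0.3141, 3.99, len(message)))
# A receiver whose carrier is offset by one sample generally does not.
misaligned = mix(sent, chaotic_bits(0.3141, 3.99, len(message), skip=1))
```

The `skip` argument plays the role of an uncompensated delay between emitter and receiver, which is why some form of real-time control of the chaotic signal is required.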
Abstract:
We studied the coastal zone of the Tavoliere di Puglia plain (Puglia region, southern Italy) with the aim of recognizing the main unconformities and, therefore, the unconformity-bounded stratigraphic units (UBSUs; Salvador 1987, 1994) forming its Quaternary sedimentary fill. Recognizing unconformities is particularly problematic in an alluvial plain, due to the difficulties in distinguishing the unconformities that bound the UBSUs. So far, the recognition of UBSUs in buried successions has been made mostly by using seismic profiles. In our case, the unavailability of the latter prompted us to address the problem by developing a methodological protocol consisting of the following steps: I) geological survey in the field; II) drafting of a preliminary geological setting based on the field-survey results; III) dating of 102 samples coming from a large number of boreholes and some outcropping sections by means of the amino acid racemization (AAR) method applied to ostracod shells and 14C dating, filtering of the ages and selection of the valid ages; IV) correction of the preliminary geological setting in the light of the numerical ages; definition of the final geological setting with UBSUs; identification of a "hypothetical" or "attributed" time range (HTR or ATR) for each UBSU, the former very wide and subject to subsequent modification, the latter definitive; V) cross-checking between the numerical ages and/or other characteristics of the sedimentary bodies and/or the sea-level curves (with their effects on the sedimentary processes) in order to narrow the hypothetical time ranges into attributed time ranges. The successful application of AAR geochronology to ostracod shells relies on the fact that the ability of ostracods to colonize almost all environments constitutes a tool for correlation, and also allows the inclusion in the same unit of coeval sediments that differ lithologically and paleoenvironmentally.
The treatment of the numerical ages obtained using the AAR method required special attention. The first filtering step was made by the laboratory (rejection criteria a and b). The second filtering step was made by testing the remaining ages in the field. In doing so, we never compared an age with a single preceding and/or following age; instead, we identified homogeneous groups of numerical ages consistent with their reciprocal stratigraphic position. This operation led to the rejection of further numerical ages that deviated erratically from a larger, homogeneous age population which fits well with its stratigraphic position (rejection criterion c). After all of the filtering steps, the valid ages that remained were used for the subdivision of the sedimentary sequences into UBSUs together with the lithological and paleoenvironmental criteria. The numerical ages allowed us, in the first instance, to recognize all of the age gaps between two consecutive samples. Next, we identified the level, in the sedimentary thickness that is between these two samples, that may represent the most suitable UBSU boundary based on its lithology and/or the paleoenvironment. The recognized units are: I) Coppa Nevigata sands (NEA), HTR: MIS 20–14, ATR: MIS 17–16; II) Argille subappennine (ASP), HTR: MIS 15–11, ATR: MIS 15–13; III) Coppa Nevigata synthem (NVI), HTR: MIS 13–8, ATR: MIS 12–11; IV) Sabbie di Torre Quarto (STQ), HTR: MIS 13–9.1, ATR: MIS 11; V) Amendola subsynthem (MLM1), HTR: MIS 12–10, ATR: MIS 11; VI) Undifferentiated continental unit (UCI), HTR: MIS 11–6.2, ATR: MIS 9.3–7.1; VII) Foggia synthem (TGF), ATR: MIS 6; VIII) Masseria Finamondo synthem (TPF), ATR: Upper Pleistocene; IX) Carapelle and Cervaro streams synthem (RPL), subdivided into: IXa) Incoronata subsynthem (RPL1), HTR: MIS 6–3; ATR: MIS 5–3; IXb) Marane La Pidocchiosa–Castello subsynthem (RPL3), ATR: Holocene; X) Masseria Inacquata synthem (NAQ), ATR: Holocene.
The possibility of recognizing and dating Quaternary units in an alluvial plain at the scale of a marine isotope stage constitutes a clear step forward compared with similar studies regarding other alluvial-plain areas, where Quaternary units were dated almost exclusively using their stratigraphic position. As a result, they were generically associated with a geological sub-epoch. Instead, our method allowed greater detail in the timing of the sedimentary processes: for example, MIS 11 and MIS 5.5 deposits have been recognized and characterized for the first time in the study area, highlighting their importance as phases of sedimentation.
Abstract:
The high performance and capacity of current FPGAs make them suitable as acceleration co-processors. This article studies the implementation, for such accelerators, of the floating-point power function x^y as defined by the C99 and IEEE 754-2008 standards, generalized here to arbitrary exponent and mantissa sizes. Last-bit accuracy at the smallest possible cost is obtained thanks to a careful study of the various subcomponents: a floating-point logarithm, a modified floating-point exponential, and a truncated floating-point multiplier. A parameterized architecture generator in the open-source FloPoCo project is presented in detail and evaluated.
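The three subcomponents listed above correspond to the identity x^y = exp(y · log x) for positive x. The Python sketch below composes them naively in double precision, only to show the structure; it is not the FloPoCo architecture, and the function name `pow_via_exp_log` is an assumption of this example.

```python
import math

def pow_via_exp_log(x, y):
    """Compute x**y for x > 0 as exp(y * log(x))."""
    if x <= 0.0:
        # Zero, negative, infinite and NaN inputs need the special-case
        # handling defined by the standards, which this sketch omits.
        raise ValueError("this sketch only handles x > 0")
    return math.exp(y * math.log(x))

approx = pow_via_exp_log(2.0, 10.0)
exact = math.pow(2.0, 10.0)
rel_err = abs(approx - exact) / exact
```

In this naive form, any rounding error in the product y · log x becomes a relative error of similar absolute size in the result after exponentiation, so the error grows with the magnitude of the product. This is one reason why last-bit accuracy calls for a careful choice of the internal precision of each subcomponent.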
Abstract:
The study presented in this paper aims to provide a description of telecommunication blogs as a genre. Lexical phrases are analysed in order to reach conclusions regarding the nature of the language in these texts and the extent to which the results obtained are comparable with other written or conversational discourse types. Although the starting hypothesis is that the articles in blogs are basically transactional and their main objective is to transfer information, the conclusions point to interaction as a distinctive characteristic of this type of discourse and also to a more careful organization in the comments on the blog entries than originally expected.
Abstract:
This paper examines two kinds of questions relating to the lexical needs of professional ESP students: (1) what range of terms and words do they need help with? (2) what types of dictionary, bilingual and/or monolingual, can they make use of in solving lexical problems?