904 results for Linguistic and extralinguistic variation
Abstract:
OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web
1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS
Computational Linguistics is already a consolidated research area. It builds upon the results of two other major areas, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly, its best-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs.
These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that are perhaps less well known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.
Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.
However, linguistic annotation tools still have some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.
A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitation stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.
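One simple way to combine annotations that several tools produce for a common level is a per-token majority vote. The sketch below is only an illustration of that idea, not the combination method of the work excerpted here; the function name `combine_tags`, the tag labels and the three tagger outputs are all hypothetical. It also presupposes that the tools already share one tag set, which is exactly what limitation (3) makes hard in practice.

```python
from collections import Counter


def combine_tags(tag_sequences):
    """Combine per-token tags from several tools by majority vote.

    tag_sequences: list of tag lists, one list per tool, all aligned
    to the same tokens and using the same (assumed) tag set.
    """
    if not tag_sequences:
        return []
    length = len(tag_sequences[0])
    if any(len(seq) != length for seq in tag_sequences):
        raise ValueError("all tools must annotate the same tokens")
    combined = []
    for token_tags in zip(*tag_sequences):
        # Pick the most frequent tag; on a tie, the tag seen first wins.
        tag, _ = Counter(token_tags).most_common(1)[0]
        combined.append(tag)
    return combined


# Hypothetical outputs of three POS taggers for "the old man the boats".
tagger_a = ["DET", "ADJ", "NOUN", "DET", "NOUN"]
tagger_b = ["DET", "NOUN", "VERB", "DET", "NOUN"]
tagger_c = ["DET", "NOUN", "VERB", "DET", "NOUN"]

combined = combine_tags([tagger_a, tagger_b, tagger_c])
print(combined)
```

Here the two taggers that agree outvote the third, so an isolated error of one tool does not reach the combined annotation.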
In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by a higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.
Therefore, it would be quite useful to find a way to
(i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools;
(ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. Ontologies (Gruber, 1993; Borst, 1997), in turn, have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should also help solve the aforementioned problems and limitations of linguistic annotation tools.
Thus, to summarise, the main aim of the present work was to combine these separate approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.
2. GOALS OF THE PRESENT WORK
As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based
Abstract:
Early in the development of plant evolutionary biology, genetic drift, fluctuations in population size, and isolation were identified as critical processes that affect the course of evolution in plant species. Attempts to assess these processes in natural populations became possible only with the development of neutral genetic markers in the 1960s. More recently, the application of historically ordered neutral molecular variation (within the conceptual framework of coalescent theory) has allowed a reevaluation of these microevolutionary processes. Gene genealogies trace the evolutionary relationships among haplotypes (alleles) within populations. Processes such as selection, fluctuation in population size, and population substructuring affect the geographical and genealogical relationships among these alleles. Therefore, examination of these genealogical data can provide insights into the evolutionary history of a species. For example, studies of Arabidopsis thaliana have suggested that this species underwent rapid expansion, with populations showing little genetic differentiation. The new discipline of phylogeography examines the distribution of allele genealogies in an explicit geographical context. Phylogeographic studies of plants have documented the recolonization of European tree species from refugia subsequent to Pleistocene glaciation, and such studies have been instructive in understanding the origin and domestication of the crop cassava. Currently, several technical limitations hinder the widespread application of a genealogical approach to plant evolutionary studies. However, as these technical issues are solved, a genealogical approach holds great promise for understanding these previously elusive processes in plant evolution.
Abstract:
Current evidence on the long-term evolutionary effect of insertion of sequence elements into gene regions is reviewed, restricted to cases where a sequence derived from a past insertion participates in the regulation of expression of a useful gene. Ten such examples in eukaryotes demonstrate that segments of repetitive DNA or mobile elements have been inserted in the past in gene regions, have been preserved, sometimes modified by selection, and now affect control of transcription of the adjacent gene. Included are only examples in which transcription control was modified by the insert. Several cases in which merely transcription initiation occurred in the insert were set aside. Two of the examples involved the long terminal repeats of mammalian endogenous retroviruses. Another two examples were control of transcription by repeated sequence inserts in sea urchin genomes. There are now six published examples in which Alu sequences were inserted long ago into human gene regions, were modified, and now are central in control/enhancement of transcription. The number of published examples of Alu sequences affecting gene control has grown threefold in the last year and is likely to continue growing. Taken together, all of these examples show that the insertion of sequence elements in the genome has been a significant source of regulatory variation in evolution.
Abstract:
As additivity is a very useful property for a distance measure, a general additive distance is proposed under the stationary time-reversible (SR) model of nucleotide substitution or, more generally, under the stationary, time-reversible, and rate variable (SRV) model, which allows rate variation among nucleotide sites. A method for estimating the mean distance and the sampling variance is developed. In addition, a method is developed for estimating the variance-covariance matrix of distances, which is useful for the statistical test of phylogenies and molecular clocks. Computer simulation shows (i) if the sequences are longer than, say, 1000 bp, the SR method is preferable to simpler methods; (ii) the SR method is robust against deviations from time-reversibility; (iii) when the rate varies among sites, the SRV method is much better than the SR method because the distance is seriously underestimated by the SR method; and (iv) our method for estimating the sampling variance is accurate for sequences longer than 500 bp. Finally, a test is constructed for testing whether DNA evolution follows a general Markovian model.
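The underestimation mentioned in point (iii) can be illustrated with a much simpler model than the SR and SRV models of the abstract. The sketch below swaps in the Jukes-Cantor distance and its gamma-corrected variant, which are standard textbook formulas and not the method proposed in the abstract. The observed proportion of differing sites `p = 0.3` and the gamma shape parameter `alpha = 0.5` are assumed values chosen only for illustration.

```python
import math


def jc_distance(p):
    """Jukes-Cantor distance for a proportion p of differing sites,
    assuming equal substitution rates at all sites."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def jc_gamma_distance(p, alpha):
    """Jukes-Cantor distance when rates vary among sites following a
    gamma distribution with shape parameter alpha."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return 0.75 * alpha * ((1 - 4 * p / 3) ** (-1 / alpha) - 1)


p = 0.3       # assumed observed proportion of differing sites
alpha = 0.5   # assumed gamma shape (smaller = stronger rate variation)

d_equal = jc_distance(p)
d_gamma = jc_gamma_distance(p, alpha)
print(f"equal rates: {d_equal:.4f}, gamma rates: {d_gamma:.4f}")
```

For the same observed difference, the estimate that ignores rate variation is the smaller of the two, which mirrors qualitatively the behaviour the abstract reports for the SR method relative to the SRV method.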
Abstract:
Cover title.
Abstract:
Issued June 1979.
Abstract:
Title varies. 1st-3d series in "Trübner's oriental series."
Abstract:
Thesis (Master's)--University of Washington, 2016-06
Abstract:
We conducted a demographic and genetic study to investigate the effects of fragmentation due to the establishment of an exotic softwood plantation on populations of a small marsupial carnivore, the agile antechinus (Antechinus agilis), and the factors influencing the persistence of those populations in the fragmented habitat. The first aspect of the study was a descriptive analysis of patch occupancy and population size, in which we found a patch occupancy rate of 70% among 23 sites in the fragmented habitat compared to 100% among 48 sites with the same habitat characteristics in unfragmented habitat. Mark-recapture analyses yielded most-likely population size estimates of between 3 and 85 among the 16 occupied patches in the fragmented habitat. Hierarchical partitioning and model selection were used to identify geographic and habitat-related characteristics that influence patch occupancy and population size. Patch occupancy was primarily influenced by geographic isolation and habitat quality (vegetation basal area). The variance in population size among occupied sites was influenced primarily by forest type (dominant Eucalyptus species) and, to a lesser extent, by patch area and topographic context (gully sites had larger populations). A comparison of the sex ratios between the samples from the two habitat contexts revealed a significant deficiency of males in the fragmented habitat. We hypothesise that this is due to male-biased dispersal in an environment with increased dispersal-associated mortality. The population size and sex ratio data were incorporated into a simulation study to estimate the proportion of genetic diversity that would have been lost over the known timescale since fragmentation if the patch populations had been totally isolated. 
The observed difference in genetic diversity (gene diversity and allelic richness at microsatellite and mitochondrial markers) between 16 fragmented and 12 unfragmented sites was extremely low and inconsistent with the isolation of the patch populations. Our results show that although the remnant habitat patches comprise approximately 2% of the study area, they can support non-isolated populations. However, the distribution of agile antechinus populations in the fragmented system is dependent on habitat quality and patch connectivity. © 2004 Elsevier Ltd. All rights reserved.
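Two of the quantities in this abstract can be checked or illustrated with a few lines of arithmetic. The sketch below first recomputes the occupancy rate from the reported counts (16 occupied patches among 23 sites). It then uses the standard expectation for loss of heterozygosity under drift in a fully isolated population, H_t / H_0 = (1 - 1/(2N))^t, as a stand-in for the authors' simulation, which the abstract does not describe. Treating the smallest and largest population estimates (3 and 85) as effective sizes, and the figure of 30 generations, are assumptions made here for illustration only.

```python
def occupancy_rate(occupied, total):
    """Fraction of surveyed sites that are occupied."""
    return occupied / total


def retained_heterozygosity(n_effective, generations):
    """Expected fraction of heterozygosity retained after drift in a
    completely isolated population of constant effective size."""
    return (1 - 1 / (2 * n_effective)) ** generations


rate = occupancy_rate(16, 23)
print(f"occupancy: {rate:.1%}")

generations = 30  # hypothetical; the abstract gives no timescale
for n in (3, 85):  # reported size estimates, assumed to equal effective size
    kept = retained_heterozygosity(n, generations)
    print(f"N = {n}: fraction retained = {kept:.3f}")
```

Under these assumptions a totally isolated population of 3 would lose nearly all of its diversity while one of 85 would retain most of it, which shows why a very small observed difference between fragmented and unfragmented sites argues against isolation.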
Abstract:
We compared growth rates of the lemon shark, Negaprion brevirostris, from Bimini, Bahamas and the Marquesas Keys (MK), Florida using data obtained in a multi-year annual census. In Bimini we marked new neonate and juvenile sharks with unique electronic identity tags, and in the MK we tagged neonate and juvenile sharks. Sharks were tagged with tiny, subcutaneous transponders, a type of tagging thought to cause little, if any, disruption to normal growth patterns when compared to conventional external tagging. Within the first 2 years of this project, no age data were recorded for sharks caught for the first time in Bimini. Therefore, we applied and tested two methods of age analysis: (1) a modified 'minimum convex polygon' method and (2) a new age-assigning method, the 'cut-off technique'. The cut-off technique proved to be the more suitable one, enabling us to identify the age of 134 of the 642 sharks of previously unknown age. This maximised the usable growth data included in our analysis. Annual absolute growth rates of juvenile, nursery-bound lemon sharks were almost constant for the two Bimini nurseries and can be best described by a simple linear model (growth data were only available for age-0 sharks in the MK). Annual absolute growth for age-0 sharks was much greater in the MK than in either the North Sound (NS) or Shark Land (SL) at Bimini. Growth of SL sharks was significantly faster during the first 2 years of life than that of the sharks in the NS population. However, in MK, only growth in the first year was considered to be reliably estimated due to low recapture rates. Analyses indicated no significant differences in growth rates between males and females for any area.
Abstract:
In vitro measurements of skin absorption are an increasingly important aspect of regulatory studies, product support claims, and formulation screening. However, such measurements are significantly affected by skin variability. The purpose of this study was to determine inter- and intralaboratory variation in diffusion cell measurements caused by factors other than skin. This was attained through the use of an artificial (silicone rubber) rate-limiting membrane and the provision of materials including a standard penetrant, methyl paraben (MP), and a minimally prescriptive protocol to each of the 18 participating laboratories. Standardized calculations of MP flux were determined from the data submitted by each laboratory by applying a predefined mathematical model. This was deemed necessary to eliminate any interlaboratory variation caused by different methods of flux calculations. Average fluxes of MP calculated and reported by each laboratory (60 ± 27 µg cm⁻² h⁻¹, n = 25, range 27-101) were in agreement with the standardized calculations of MP flux (60 ± 21 µg cm⁻² h⁻¹, range 19-120). The coefficient of variation between laboratories was approximately 35% and was manifest as a fourfold difference between the lowest and highest average flux values and a sixfold difference between the lowest and highest individual flux values. Intralaboratory variation was lower, averaging 10% for five individuals using the same equipment within a single laboratory. Further studies should be performed to clarify the exact components responsible for nonskin-related variability in diffusion cell measurements. It is clear that further developments of in vitro methodologies for measuring skin absorption are required. © 2005 Wiley-Liss, Inc.
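The variability figures in this abstract can be related to one another with simple arithmetic. The sketch below computes a coefficient of variation as standard deviation over mean and a fold difference as highest over lowest value. Pairing the 35% figure with the standardized flux (60 ± 21), the fourfold difference with the range 27-101 and the sixfold difference with the range 19-120 is an inference from the reported numbers, not something the abstract states explicitly.

```python
def coefficient_of_variation(mean, sd):
    """Coefficient of variation as a percentage of the mean."""
    return 100 * sd / mean


def fold_difference(lowest, highest):
    """Ratio of the highest to the lowest value."""
    return highest / lowest


# Standardized flux reported as 60 +/- 21 (assumed to be mean and SD).
cv = coefficient_of_variation(60, 21)

# Ranges taken from the abstract; their pairing with the fourfold and
# sixfold statements is assumed here.
fold_a = fold_difference(27, 101)
fold_b = fold_difference(19, 120)

print(f"CV = {cv:.0f}%")
print(f"fold differences: {fold_a:.1f} and {fold_b:.1f}")
```

Under those pairings the numbers are mutually consistent: the ratio of SD to mean gives the stated coefficient of variation, and the two ranges give ratios close to four and six.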
Abstract:
Geographic variation in vocalizations is widespread in passerine birds, but its origins and maintenance remain unclear. One hypothesis to explain this variation is that it is associated with geographic isolation among populations and therefore should follow a vicariant pattern similar to that typically found in neutral genetic markers. Alternatively, if environmental selection strongly influences vocalizations, then genetic divergence and vocal divergence may be disassociated. This study compared genetic divergence derived from 11 microsatellite markers with a metric of phenotypic divergence derived from male bower advertisement calls. Data were obtained from 16 populations throughout the entire distribution of the satin bowerbird, an Australian wet-forest-restricted passerine. There was no relationship between call divergence and genetic divergence, similar to most other studies on birds with learned vocalizations. Genetic divergence followed a vicariant model of evolution, with the differentiation of isolated populations and isolation-by-distance among continuous populations. Previous work on Ptilonorhynchus violaceus has shown that advertisement call structure is strongly influenced by the acoustic environment of different habitats. Divergence in vocalizations among genetically related populations in different habitats indicates that satin bowerbirds match their vocalizations to the environment in which they live, despite the homogenizing influence of gene flow. In combination with convergence of vocalizations among genetically divergent populations occurring in the same habitat, this shows the overriding importance that habitat-related selection can have on the establishment and maintenance of variation in vocalizations.
Abstract:
The scale at which algal biodiversity is partitioned across the landscape, and the biophysical processes and biotic interactions that shape these communities in dryland river refugia, were studied on two occasions at 30 sites in two Australian dryland rivers. Despite the waterholes studied having characteristically high levels of abiogenic turbidity, a total of 186 planktonic microalgal, 253 benthic diatom and 62 macroalgal species were recorded. The phytoplankton communities were dominated by flagellated cryptophytes, euglenophytes and chlorophytes, the diatom communities by cosmopolitan taxa known to tolerate wide environmental conditions, and the macroalgal communities by filamentous cyanobacteria. All algal communities showed significant differences between catchments and sampling times, with a suite of between 5 and 12 taxa responsible for approximately 50% of the observed change. In general, algal assemblage patterns were poorly correlated with the measured environmental variables. Phytoplankton and diatom assemblage patterns were weakly correlated with several waterhole geomorphic measures, whereas macroalgal assemblage patterns showed some association with variability in ionic concentration.