965 resultados para Approximat Model (scheme)


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Generalized linear mixed models with semiparametric random effects are useful in a wide variety of Bayesian applications. When the random effects arise from a mixture of Dirichlet process (MDP) model, normal base measures and Gibbs sampling procedures based on the Pólya urn scheme are often used to simulate posterior draws. These algorithms are applicable in the conjugate case when (for a normal base measure) the likelihood is normal. In the non-conjugate case, the algorithms proposed by MacEachern and Müller (1998) and Neal (2000) are often applied to generate posterior samples. Some common problems associated with simulation algorithms for non-conjugate MDP models include convergence and mixing difficulties. This paper proposes an algorithm based on the Pólya urn scheme that extends the Gibbs sampling algorithms to non-conjugate models with normal base measures and exponential family likelihoods. The algorithm proceeds by making Laplace approximations to the likelihood function, thereby reducing the procedure to that of conjugate normal MDP models. To ensure the validity of the stationary distribution in the non-conjugate case, the proposals are accepted or rejected by a Metropolis-Hastings step. In the special case where the data are normally distributed, the algorithm is identical to the Gibbs sampler.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The selective catalytic reduction system is a well established technology for NOx emissions control in diesel engines. A one dimensional, single channel selective catalytic reduction (SCR) model was previously developed using Oak Ridge National Laboratory (ORNL) generated reactor data for an iron-zeolite catalyst system. Calibration of this model to fit the experimental reactor data collected at ORNL for a copper-zeolite SCR catalyst is presented. Initially a test protocol was developed in order to investigate the different phenomena responsible for the SCR system response. A SCR model with two distinct types of storage sites was used. The calibration process was started with storage capacity calculations for the catalyst sample. Then the chemical kinetics occurring at each segment of the protocol was investigated. The reactions included in this model were adsorption, desorption, standard SCR, fast SCR, slow SCR, NH3 Oxidation, NO oxidation and N2O formation. The reaction rates were identified for each temperature using a time domain optimization approach. Assuming an Arrhenius form of the reaction rates, activation energies and pre-exponential parameters were fit to the reaction rates. The results indicate that the Arrhenius form is appropriate and the reaction scheme used allows the model to fit to the experimental data and also for use in real world engine studies.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

File system security is fundamental to the security of UNIX and Linux systems since in these systems almost everything is in the form of a file. To protect the system files and other sensitive user files from unauthorized accesses, certain security schemes are chosen and used by different organizations in their computer systems. A file system security model provides a formal description of a protection system. Each security model is associated with specified security policies which focus on one or more of the security principles: confidentiality, integrity and availability. The security policy is not only about “who” can access an object, but also about “how” a subject can access an object. To enforce the security policies, each access request is checked against the specified policies to decide whether it is allowed or rejected. The current protection schemes in UNIX/Linux systems focus on the access control. Besides the basic access control scheme of the system itself, which includes permission bits, setuid and seteuid mechanism and the root, there are other protection models, such as Capabilities, Domain Type Enforcement (DTE) and Role-Based Access Control (RBAC), supported and used in certain organizations. These models protect the confidentiality of the data directly. The integrity of the data is protected indirectly by only allowing trusted users to operate on the objects. The access control decisions of these models depend on either the identity of the user or the attributes of the process the user can execute, and the attributes of the objects. Adoption of these sophisticated models has been slow; this is likely due to the enormous complexity of specifying controls over a large file system and the need for system administrators to learn a new paradigm for file protection. We propose a new security model: file system firewall. It is an adoption of the familiar network firewall protection model, used to control the data that flows between networked computers, toward file system protection. This model can support decisions of access control based on any system generated attributes about the access requests, e.g., time of day. The access control decisions are not on one entity, such as the account in traditional discretionary access control or the domain name in DTE. In file system firewall, the access decisions are made upon situations on multiple entities. A situation is programmable with predicates on the attributes of subject, object and the system. File system firewall specifies the appropriate actions on these situations. We implemented the prototype of file system firewall on SUSE Linux. Preliminary results of performance tests on the prototype indicate that the runtime overhead is acceptable. We compared file system firewall with TE in SELinux to show that firewall model can accommodate many other access control models. Finally, we show the ease of use of firewall model. When firewall system is restricted to specified part of the system, all the other resources are not affected. This enables a relatively smooth adoption. This fact and that it is a familiar model to system administrators will facilitate adoption and correct use. The user study we conducted on traditional UNIX access control, SELinux and file system firewall confirmed that. The beginner users found it easier to use and faster to learn then traditional UNIX access control scheme and SELinux.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

There has been a discontinuous but fairly persistent long-term decline in homicide rates in core European countries since about 1500. Since the 1950s, however, we observe an upward trend in violent crime not only in Europe but in almost all of the economically advanced nations that combine democratic political structures with free-market economies. The paper presents an explanatory scheme designed to account for both, the long decline and its apparent reversal. The theoretical model draws heavily upon ideas taken from the sociological work of Emile Durkheim and Norbert Elias—with some modifications and extensions. It seeks to integrate sociological and historical perspectives and to give due weight to both, structural and developmental forces. A key hypothesis is that the pacifying effects of the erosion of traditional collectivism can only be maintained to the extent by which “cooperative individualism” dominates over against the forces of “disintegrative individualism.” Some suggestions are made concerning the selection of appropriate indicators and the handling of methodological problems related to causal attribution.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A new anisotropic elastic-viscoplastic damage constitutive model for bone is proposed using an eccentric elliptical yield criterion and nonlinear isotropic hardening. A micromechanics-based multiscale homogenization scheme proposed by Reisinger et al. is used to obtain the effective elastic properties of lamellar bone. The dissipative process in bone is modeled as viscoplastic deformation coupled to damage. The model is based on an orthotropic ecuntric elliptical criterion in stress space. In order to simplify material identification, an eccentric elliptical isotropic yield surface was defined in strain space, which is transformed to a stress-based criterion by means of the damaged compliance tensor. Viscoplasticity is implemented by means of the continuous Perzyna formulation. Damage is modeled by a scalar function of the accumulated plastic strain D(κ) , reducing all element s of the stiffness matrix. A polynomial flow rule is proposed in order to capture the rate-dependent post-yield behavior of lamellar bone. A numerical algorithm to perform the back projection on the rate-dependent yield surface has been developed and implemented in the commercial finite element solver Abaqus/Standard as a user subroutine UMAT. A consistent tangent operator has been derived and implemented in order to ensure quadratic convergence. Correct implementation of the algorithm, convergence, and accuracy of the tangent operator was tested by means of strain- and stress-based single element tests. A finite element simulation of nano- indentation in lamellar bone was finally performed in order to show the abilities of the newly developed constitutive model.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In the course of this study, stiffness of a fibril array of mineralized collagen fibrils modeled with a mean field method was validated experimentally at site-matched two levels of tissue hierarchy using mineralized turkey leg tendons (MTLT). The applied modeling approaches allowed to model the properties of this unidirectional tissue from nanoscale (mineralized collagen fibrils) to macroscale (mineralized tendon). At the microlevel, the indentation moduli obtained with a mean field homogenization scheme were compared to the experimental ones obtained with microindentation. At the macrolevel, the macroscopic stiffness predicted with micro finite element (μFE) models was compared to the experimental stiffness measured with uniaxial tensile tests. Elastic properties of the elements in μFE models were injected from the mean field model or two-directional microindentations. Quantitatively, the indentation moduli can be properly predicted with the mean-field models. Local stiffness trends within specific tissue morphologies are very weak, suggesting additional factors responsible for the stiffness variations. At macrolevel, the μFE models underestimate the macroscopic stiffness, as compared to tensile tests, but the correlations are strong.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This study examines how different microphysical parameterization schemes influence orographically induced precipitation and the distributions of hydrometeors and water vapour for midlatitude summer conditions in the Weather Research and Forecasting (WRF) model. A high-resolution two-dimensional idealized simulation is used to assess the differences between the schemes in which a moist air flow is interacting with a bell-shaped 2 km high mountain. Periodic lateral boundary conditions are chosen to recirculate atmospheric water in the domain. It is found that the 13 selected microphysical schemes conserve the water in the model domain. The gain or loss of water is less than 0.81% over a simulation time interval of 61 days. The differences of the microphysical schemes in terms of the distributions of water vapour, hydrometeors and accumulated precipitation are presented and discussed. The Kessler scheme, the only scheme without ice-phase processes, shows final values of cloud liquid water 14 times greater than the other schemes. The differences among the other schemes are not as extreme, but still they differ up to 79% in water vapour, up to 10 times in hydrometeors and up to 64% in accumulated precipitation at the end of the simulation. The microphysical schemes also differ in the surface evaporation rate. The WRF single-moment 3-class scheme has the highest surface evaporation rate compensated by the highest precipitation rate. The different distributions of hydrometeors and water vapour of the microphysical schemes induce differences up to 49 W m−2 in the downwelling shortwave radiation and up to 33 W m−2 in the downwelling longwave radiation.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Patients suffering from cystic fibrosis (CF) show thick secretions, mucus plugging and bronchiectasis in bronchial and alveolar ducts. This results in substantial structural changes of the airway morphology and heterogeneous ventilation. Disease progression and treatment effects are monitored by so-called gas washout tests, where the change in concentration of an inert gas is measured over a single or multiple breaths. The result of the tests based on the profile of the measured concentration is a marker for the severity of the ventilation inhomogeneity strongly affected by the airway morphology. However, it is hard to localize underlying obstructions to specific parts of the airways, especially if occurring in the lung periphery. In order to support the analysis of lung function tests (e.g. multi-breath washout), we developed a numerical model of the entire airway tree, coupling a lumped parameter model for the lung ventilation with a 4th-order accurate finite difference model of a 1D advection-diffusion equation for the transport of an inert gas. The boundary conditions for the flow problem comprise the pressure and flow profile at the mouth, which is typically known from clinical washout tests. The natural asymmetry of the lung morphology is approximated by a generic, fractal, asymmetric branching scheme which we applied for the conducting airways. A conducting airway ends when its dimension falls below a predefined limit. A model acinus is then connected to each terminal airway. The morphology of an acinus unit comprises a network of expandable cells. A regional, linear constitutive law describes the pressure-volume relation between the pleural gap and the acinus. The cyclic expansion (breathing) of each acinus unit depends on the resistance of the feeding airway and on the flow resistance and stiffness of the cells themselves. Special care was taken in the development of a conservative numerical scheme for the gas transport across bifurcations, handling spatially and temporally varying advective and diffusive fluxes over a wide range of scales. Implicit time integration was applied to account for the numerical stiffness resulting from the discretized transport equation. Local or regional modification of the airway dimension, resistance or tissue stiffness are introduced to mimic pathological airway restrictions typical for CF. This leads to a more heterogeneous ventilation of the model lung. As a result the concentration in some distal parts of the lung model remains increased for a longer duration. The inert gas concentration at the mouth towards the end of the expirations is composed of gas from regions with very different washout efficiency. This results in a steeper slope of the corresponding part of the washout profile.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Simulating surface wind over complex terrain is a challenge in regional climate modelling. Therefore, this study aims at identifying a set-up of the Weather Research and Forecasting Model (WRF) model that minimises system- atic errors of surface winds in hindcast simulations. Major factors of the model configuration are tested to find a suitable set-up: the horizontal resolution, the planetary boundary layer (PBL) parameterisation scheme and the way the WRF is nested to the driving data set. Hence, a number of sensitivity simulations at a spatial resolution of 2 km are carried out and compared to observations. Given the importance of wind storms, the analysis is based on case studies of 24 historical wind storms that caused great economic damage in Switzerland. Each of these events is downscaled using eight different model set-ups, but sharing the same driving data set. The results show that the lack of representation of the unresolved topography leads to a general overestimation of wind speed in WRF. However, this bias can be substantially reduced by using a PBL scheme that explicitly considers the effects of non-resolved topography, which also improves the spatial structure of wind speed over Switzerland. The wind direction, although generally well reproduced, is not very sensitive to the PBL scheme. Further sensitivity tests include four types of nesting methods: nesting only at the boundaries of the outermost domain, analysis nudging, spectral nudging, and the so-called re-forecast method, where the simulation is frequently restarted. These simulations show that restricting the freedom of the model to develop large-scale disturbances slightly increases the temporal agreement with the observations, at the same time that it further reduces the overestimation of wind speed, especially for maximum wind peaks. The model performance is also evaluated in the outermost domains, where the resolution is coarser. The results demonstrate the important role of horizontal resolution, where the step from 6 to 2 km significantly improves model performance. In summary, the combination of a grid size of 2 km, the non-local PBL scheme modified to explicitly account for non-resolved orography, as well as analysis or spectral nudging, is a superior combination when dynamical downscaling is aimed at reproducing real wind fields.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Calving is a major mechanism of ice discharge of the Antarctic and Greenland ice sheets, and a change in calving front position affects the entire stress regime of marine terminating glaciers. The representation of calving front dynamics in a 2-D or 3-D ice sheet model remains non-trivial. Here, we present the theoretical and technical framework for a level-set method, an implicit boundary tracking scheme, which we implement into the Ice Sheet System Model (ISSM). This scheme allows us to study the dynamic response of a drainage basin to user-defined calving rates. We apply the method to Jakobshavn Isbræ, a major marine terminating outlet glacier of the West Greenland Ice Sheet. The model robustly reproduces the high sensitivity of the glacier to calving, and we find that enhanced calving triggers significant acceleration of the ice stream. Upstream acceleration is sustained through a combination of mechanisms. However, both lateral stress and ice influx stabilize the ice stream. This study provides new insights into the ongoing changes occurring at Jakobshavn Isbræ and emphasizes that the incorporation of moving boundaries and dynamic lateral effects, not captured in flow-line models, is key for realistic model projections of sea level rise on centennial timescales.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

It is commonly understood that the observed decline in precipitation in South-West Australia during the 20th century is caused by anthropogenic factors. Candidates therefore are changes to large-scale atmospheric circulations due to global warming, extensive deforestation and anthropogenic aerosol emissions - all of which are effective on different spatial and temporal scales. This contribution focusses on the role of rapidly rising aerosol emissions from anthropogenic sources in South-West Australia around 1970. An analysis of historical longterm rainfall data of the Bureau of Meteorology shows that South-West Australia as a whole experienced a gradual decline in precipitation over the 20th century. However, on smaller scales and for the particular example of the Perth catchment area, a sudden drop in precipitation around 1970 is apparent. Modelling experiments at a convection-resolving resolution of 3.3km using the Weather and Research Forecasting (WRF) model version 3.6.1 with the aerosol-aware Thompson-Eidhammer microphysics scheme are conducted for the period 1970-1974. A comparison of four runs with different prescribed aerosol emissions and without aerosol effects demonstrates that tripling the pre-1960s atmospheric CCN and IN concentrations can suppress precipitation by 2-9%, depending on the area and the season. This suggests that a combination of all three processes is required to account for the gradual decline in rainfall seen for greater South-West Australia and for the sudden drop observed in areas along the West Coast in the 1970s: changing atmospheric circulations, deforestation and anthropogenic aerosol emissions.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Proof-Carrying Code (PCC) is a general approach to mobile code safety in which programs are augmented with a certificate (or proof). The intended benefit is that the program consumer can locally validate the certificate w.r.t. the "untrustcd" program by means of a certificate checker a process which should be much simpler, efficient, and automatic than generating the original proof. The practical uptake of PCC greatly depends on the existence of a variety of enabling technologies which allow both proving programs correct and replacing a costly verification process by an efficient checking proceduri on th( consumer side. In this work we propose Abstraction- Carrying Code (ACC), a novel approach which uses abstract interpretation as enabling technology. We argue that the large body of applications of abstract interpretation to program verification is amenable to the overall PCC scheme. In particular, we rely on an expressive class of safely policies which can be defined over different abstract domains. We use an abstraction (or abstract model) of the program computed by standard static analyzers as a certificate. The validity of the abstraction on ihe consumer side is checked in a single pass by a very efficient and specialized abstract-interpreter. We believe that ACC brings the expressiveness, flexibility and automation which is inherent in abstract interpretation techniques to the area of mobile code safety.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web 1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS Computational Linguistics is already a consolidated research area. It builds upon the results of other two major ones, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly, its most well-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs. These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools. Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate. However, linguistic annotation tools have still some limitations, which can be summarised as follows: 1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.). 2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts. 3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc. A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitations stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved. In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own ones. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by other higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool. Therefore, it would be quite useful to find a way to (i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools; (ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate. Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. There again, ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should help solve also the problems and limitations of linguistic annotation tools aforementioned. Thus, to summarise, the main aim of the present work was to combine somehow these separated approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a sort of hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section. 2. GOALS OF THE PRESENT WORK As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based triples, as in the usual Semantic Web languages (namely RDF(S) and OWL), in order for the model to be considered suitable for the Semantic Web. Besides, to be useful for the Semantic Web, this model should provide a way to automate the annotation of web pages. As for the present work, this requirement involved reusing the linguistic annotation tools purchased by the OEG research group (http://www.oeg-upm.net), but solving beforehand (or, at least, minimising) some of their limitations. Therefore, this model had to minimise these limitations by means of the integration of several linguistic annotation tools into a common architecture. Since this integration required the interoperation of tools and their annotations, ontologies were proposed as the main technological component to make them effectively interoperate. From the very beginning, it seemed that the formalisation of the elements and the knowledge underlying linguistic annotations within an appropriate set of ontologies would be a great step forward towards the formulation of such a model (henceforth referred to as OntoTag). Obviously, first, to combine the results of the linguistic annotation tools that operated at the same level, their annotation schemas had to be unified (or, preferably, standardised) in advance. This entailed the unification (id. standardisation) of their tags (both their representation and their meaning), and their format or syntax. Second, to merge the results of the linguistic annotation tools operating at different levels, their respective annotation schemas had to be (a) made interoperable and (b) integrated. And third, in order for the resulting annotations to suit the Semantic Web, they had to be specified by means of an ontology-based vocabulary, and structured by means of ontology-based triples, as hinted above. Therefore, a new annotation scheme had to be devised, based both on ontologies and on this type of triples, which allowed for the combination and the integration of the annotations of any set of linguistic annotation tools. This annotation scheme was considered a fundamental part of the model proposed here, and its development was, accordingly, another major objective of the present work. All these goals, aims and objectives could be re-stated more clearly as follows: Goal 1: Development of a set of ontologies for the formalisation of the linguistic knowledge relating linguistic annotation. Sub-goal 1.1: Ontological formalisation of the EAGLES (1996a; 1996b) de facto standards for morphosyntactic and syntactic annotation, in a way that helps respect the triple structure recommended for annotations in these works (which is isomorphic to the triple structures used in the context of the Semantic Web). Sub-goal 1.2: Incorporation into this preliminary ontological formalisation of other existing standards and standard proposals relating the levels mentioned above, such as those currently under development within ISO/TC 37 (the ISO Technical Committee dealing with Terminology, which deals also with linguistic resources and annotations). Sub-goal 1.3: Generalisation and extension of the recommendations in EAGLES (1996a; 1996b) and ISO/TC 37 to the semantic level, for which no ISO/TC 37 standards have been developed yet. Sub-goal 1.4: Ontological formalisation of the generalisations and/or extensions obtained in the previous sub-goal as generalisations and/or extensions of the corresponding ontology (or ontologies). Sub-goal 1.5: Ontological formalisation of the knowledge required to link, combine and unite the knowledge represented in the previously developed ontology (or ontologies). Goal 2: Development of OntoTag’s annotation scheme, a standard-based abstract scheme for the hybrid (linguistically-motivated and ontological-based) annotation of texts. Sub-goal 2.1: Development of the standard-based morphosyntactic annotation level of OntoTag’s scheme. This level should include, and possibly extend, the recommendations of EAGLES (1996a) and also the recommendations included in the ISO/MAF (2008) standard draft. Sub-goal 2.2: Development of the standard-based syntactic annotation level of the hybrid abstract scheme. This level should include, and possibly extend, the recommendations of EAGLES (1996b) and the ISO/SynAF (2010) standard draft. Sub-goal 2.3: Development of the standard-based semantic annotation level of OntoTag’s (abstract) scheme. Sub-goal 2.4: Development of the mechanisms for a convenient integration of the three annotation levels already mentioned. These mechanisms should take into account the recommendations included in the ISO/LAF (2009) standard draft. Goal 3: Design of OntoTag’s (abstract) annotation architecture, an abstract architecture for the hybrid (semantic) annotation of texts (i) that facilitates the integration and interoperation of different linguistic annotation tools, and (ii) whose results comply with OntoTag’s annotation scheme. Sub-goal 3.1: Specification of the decanting processes that allow for the classification and separation, according to their corresponding levels, of the results of the linguistic tools annotating at several different levels. Sub-goal 3.2: Specification of the standardisation processes that allow (a) complying with the standardisation requirements of OntoTag’s annotation scheme, as well as (b) combining the results of those linguistic tools that share some level of annotation. Sub-goal 3.3: Specification of the merging processes that allow for the combination of the output annotations and the interoperation of those linguistic tools that share some level of annotation. Sub-goal 3.4: Specification of the merge processes that allow for the integration of the results and the interoperation of those tools performing their annotations at different levels. Goal 4: Generation of OntoTagger’s schema, a concrete instance of OntoTag’s abstract scheme for a concrete set of linguistic annotations. These linguistic annotations result from the tools and the resources available in the research group, namely • Bitext’s DataLexica (http://www.bitext.com/EN/datalexica.asp), • LACELL’s (POS) tagger (http://www.um.es/grupos/grupo-lacell/quees.php), • Connexor’s FDG (http://www.connexor.eu/technology/machinese/glossary/fdg/), and • EuroWordNet (Vossen et al., 1998). This schema should help evaluate OntoTag’s underlying hypotheses, stated below. Consequently, it should implement, at least, those levels of the abstract scheme dealing with the annotations of the set of tools considered in this implementation. This includes the morphosyntactic, the syntactic and the semantic levels. Goal 5: Implementation of OntoTagger’s configuration, a concrete instance of OntoTag’s abstract architecture for this set of linguistic tools and annotations. This configuration (1) had to use the schema generated in the previous goal; and (2) should help support or refute the hypotheses of this work as well (see the next section). Sub-goal 5.1: Implementation of the decanting processes that facilitate the classification and separation of the results of those linguistic resources that provide annotations at several different levels (on the one hand, LACELL’s tagger operates at the morphosyntactic level and, minimally, also at the semantic level; on the other hand, FDG operates at the morphosyntactic and the syntactic levels and, minimally, at the semantic level as well). Sub-goal 5.2: Implementation of the standardisation processes that allow (i) specifying the results of those linguistic tools that share some level of annotation according to the requirements of OntoTagger’s schema, as well as (ii) combining these shared level results. In particular, all the tools selected perform morphosyntactic annotations and they had to be conveniently combined by means of these processes. Sub-goal 5.3: Implementation of the merging processes that allow for the combination (and possibly the improvement) of the annotations and the interoperation of the tools that share some level of annotation (in particular, those relating the morphosyntactic level, as in the previous sub-goal). Sub-goal 5.4: Implementation of the merging processes that allow for the integration of the different standardised and combined annotations aforementioned, relating all the levels considered. Sub-goal 5.5: Improvement of the semantic level of this configuration by adding a named entity recognition, (sub-)classification and annotation subsystem, which also uses the named entities annotated to populate a domain ontology, in order to provide a concrete application of the present work in the two areas involved (the Semantic Web and Corpus Linguistics). 3. MAIN RESULTS: ASSESSMENT OF ONTOTAG’S UNDERLYING HYPOTHESES The model developed in the present thesis tries to shed some light on (i) whether linguistic annotation tools can effectively interoperate; (ii) whether their results can be combined and integrated; and, if they can, (iii) how they can, respectively, interoperate and be combined and integrated. Accordingly, several hypotheses had to be supported (or rejected) by the development of the OntoTag model and OntoTagger (its implementation). The hypotheses underlying OntoTag are surveyed below. Only one of the hypotheses (H.6) was rejected; the other five could be confirmed. H.1 The annotations of different levels (or layers) can be integrated into a sort of overall, comprehensive, multilayer and multilevel annotation, so that their elements can complement and refer to each other. • CONFIRMED by the development of: o OntoTag’s annotation scheme, o OntoTag’s annotation architecture, o OntoTagger’s (XML, RDF, OWL) annotation schemas, o OntoTagger’s configuration. H.2 Tool-dependent annotations can be mapped onto a sort of tool-independent annotations and, thus, can be standardised. • CONFIRMED by means of the standardisation phase incorporated into OntoTag and OntoTagger for the annotations yielded by the tools. H.3 Standardisation should ease: H.3.1: The interoperation of linguistic tools. H.3.2: The comparison, combination (at the same level and layer) and integration (at different levels or layers) of annotations. • H.3 was CONFIRMED by means of the development of OntoTagger’s ontology-based configuration: o Interoperation, comparison, combination and integration of the annotations of three different linguistic tools (Connexor’s FDG, Bitext’s DataLexica and LACELL’s tagger); o Integration of EuroWordNet-based, domain-ontology-based and named entity annotations at the semantic level. o Integration of morphosyntactic, syntactic and semantic annotations. H.4 Ontologies and Semantic Web technologies (can) play a crucial role in the standardisation of linguistic annotations, by providing consensual vocabularies and standardised formats for annotation (e.g., RDF triples). • CONFIRMED by means of the development of OntoTagger’s RDF-triple-based annotation schemas. H.5 The rate of errors introduced by a linguistic tool at a given level, when annotating, can be reduced automatically by contrasting and combining its results with the ones coming from other tools, operating at the same level. However, these other tools might be built following a different technological (stochastic vs. rule-based, for example) or theoretical (dependency vs. HPS-grammar-based, for instance) approach. • CONFIRMED by the results yielded by the evaluation of OntoTagger. H.6 Each linguistic level can be managed and annotated independently. • REJECTED: OntoTagger’s experiments and the dependencies observed among the morphosyntactic annotations, and between them and the syntactic annotations. In fact, Hypothesis H.6 was already rejected when OntoTag’s ontologies were developed. We observed then that several linguistic units stand on an interface between levels, belonging thereby to both of them (such as morphosyntactic units, which belong to both the morphological level and the syntactic level). Therefore, the annotations of these levels overlap and cannot be handled independently when merged into a unique multileveled annotation. 4. OTHER MAIN RESULTS AND CONTRIBUTIONS First, interoperability is a hot topic for both the linguistic annotation community and the whole Computer Science field. The specification (and implementation) of OntoTag’s architecture for the combination and integration of linguistic (annotation) tools and annotations by means of ontologies shows a way to make these different linguistic annotation tools and annotations interoperate in practice. Second, as mentioned above, the elements involved in linguistic annotation were formalised in a set (or network) of ontologies (OntoTag’s linguistic ontologies). • On the one hand, OntoTag’s network of ontologies consists of − The Linguistic Unit Ontology (LUO), which includes a mostly hierarchical formalisation of the different types of linguistic elements (i.e., units) identifiable in a written text; − The Linguistic Attribute Ontology (LAO), which includes also a mostly hierarchical formalisation of the different types of features that characterise the linguistic units included in the LUO; − The Linguistic Value Ontology (LVO), which includes the corresponding formalisation of the different values that the attributes in the LAO can take; − The OIO (OntoTag’s Integration Ontology), which  Includes the knowledge required to link, combine and unite the knowledge represented in the LUO, the LAO and the LVO;  Can be viewed as a knowledge representation ontology that describes the most elementary vocabulary used in the area of annotation. • On the other hand, OntoTag’s ontologies incorporate the knowledge included in the different standards and recommendations for linguistic annotation released so far, such as those developed within the EAGLES and the SIMPLE European projects or by the ISO/TC 37 committee: − As far as morphosyntactic annotations are concerned, OntoTag’s ontologies formalise the terms in the EAGLES (1996a) recommendations and their corresponding terms within the ISO Morphosyntactic Annotation Framework (ISO/MAF, 2008) standard; − As for syntactic annotations, OntoTag’s ontologies incorporate the terms in the EAGLES (1996b) recommendations and their corresponding terms within the ISO Syntactic Annotation Framework (ISO/SynAF, 2010) standard draft; − Regarding semantic annotations, OntoTag’s ontologies generalise and extend the recommendations in EAGLES (1996a; 1996b) and, since no stable standards or standard drafts have been released for semantic annotation by ISO/TC 37 yet, they incorporate the terms in SIMPLE (2000) instead; − The terms coming from all these recommendations and standards were supplemented by those within the ISO Data Category Registry (ISO/DCR, 2008) and also of the ISO Linguistic Annotation Framework (ISO/LAF, 2009) standard draft when developing OntoTag’s ontologies. Third, we showed that the combination of the results of tools annotating at the same level can yield better results (both in precision and in recall) than each tool separately. In particular, 1. OntoTagger clearly outperformed two of the tools integrated into its configuration, namely DataLexica and FDG in all the combination sub-phases in which they overlapped (i.e. POS tagging, lemma annotation and morphological feature annotation). As far as the remaining tool is concerned, i.e. LACELL’s tagger, it was also outperformed by OntoTagger in POS tagging and lemma annotation, and it did not behave better than OntoTagger in the morphological feature annotation layer. 2. As an immediate result, this implies that a) This type of combination architecture configurations can be applied in order to improve significantly the accuracy of linguistic annotations; and b) Concerning the morphosyntactic level, this could be regarded as a way of constructing more robust and more accurate POS tagging systems. Fourth, Semantic Web annotations are usually performed by humans or else by machine learning systems. Both of them leave much to be desired: the former, with respect to their annotation rate; the latter, with respect to their (average) precision and recall. In this work, we showed how linguistic tools can be wrapped in order to annotate automatically Semantic Web pages using ontologies. This entails their fast, robust and accurate semantic annotation. As a way of example, as mentioned in Sub-goal 5.5, we developed a particular OntoTagger module for the recognition, classification and labelling of named entities, according to the MUC and ACE tagsets (Chinchor, 1997; Doddington et al., 2004). These tagsets were further specified by means of a domain ontology, namely the Cinema Named Entities Ontology (CNEO). This module was applied to the automatic annotation of ten different web pages containing cinema reviews (that is, around 5000 words). In addition, the named entities annotated with this module were also labelled as instances (or individuals) of the classes included in the CNEO and, then, were used to populate this domain ontology. • The statistical results obtained from the evaluation of this particular module of OntoTagger can be summarised as follows. On the one hand, as far as recall (R) is concerned, (R.1) the lowest value was 76,40% (for file 7); (R.2) the highest value was 97, 50% (for file 3); and (R.3) the average value was 88,73%. On the other hand, as far as the precision rate (P) is concerned, (P.1) its minimum was 93,75% (for file 4); (R.2) its maximum was 100% (for files 1, 5, 7, 8, 9, and 10); and (R.3) its average value was 98,99%. • These results, which apply to the tasks of named entity annotation and ontology population, are extraordinary good for both of them. They can be explained on the basis of the high accuracy of the annotations provided by OntoTagger at the lower levels (mainly at the morphosyntactic level). However, they should be conveniently qualified, since they might be too domain- and/or language-dependent. It should be further experimented how our approach works in a different domain or a different language, such as French, English, or German. • In any case, the results of this application of Human Language Technologies to Ontology Population (and, accordingly, to Ontological Engineering) seem very promising and encouraging in order for these two areas to collaborate and complement each other in the area of semantic annotation. Fifth, as shown in the State of the Art of this work, there are different approaches and models for the semantic annotation of texts, but all of them focus on a particular view of the semantic level. Clearly, all these approaches and models should be integrated in order to bear a coherent and joint semantic annotation level. OntoTag shows how (i) these semantic annotation layers could be integrated together; and (ii) they could be integrated with the annotations associated to other annotation levels. Sixth, we identified some recommendations, best practices and lessons learned for annotation standardisation, interoperation and merge. They show how standardisation (via ontologies, in this case) enables the combination, integration and interoperation of different linguistic tools and their annotations into a multilayered (or multileveled) linguistic annotation, which is one of the hot topics in the area of Linguistic Annotation. And last but not least, OntoTag’s annotation scheme and OntoTagger’s annotation schemas show a way to formalise and annotate coherently and uniformly the different units and features associated to the different levels and layers of linguistic annotation. This is a great scientific step ahead towards the global standardisation of this area, which is the aim of ISO/TC 37 (in particular, Subcommittee 4, dealing with the standardisation of linguistic annotations and resources).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we examine the issue of memory management in the parallel execution of logic programs. We concentrate on non-deterministic and-parallel schemes which we believe present a relatively general set of problems to be solved, including most of those encountered in the memory management of or-parallel systems. We present a distributed stack memory management model which allows flexible scheduling of goals. Previously proposed models (based on the "Marker model") are lacking in that they impose restrictions on the selection of goals to be executed or they may require consume a large amount of virtual memory. This paper first presents results which imply that the above mentioned shortcomings can have significant performance impacts. An extension of the Marker Model is then proposed which allows flexible scheduling of goals while keeping (virtual) memory consumption down. Measurements are presented which show the advantage of this solution. Methods for handling forward and backward execution, cut and roll back are discussed in the context of the proposed scheme. In addition, the paper shows how the same mechanism for flexible scheduling can be applied to allow the efficient handling of the very general form of suspension that can occur in systems which combine several types of and-parallelism and more sophisticated methods of executing logic programs. We believe that the results are applicable to many and- and or-parallel systems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Opportunities offered by high performance computing provide a significant degree of promise in the enhancement of the performance of real-time flood forecasting systems. In this paper, a real-time framework for probabilistic flood forecasting through data assimilation is presented. The distributed rainfall-runoff real-time interactive basin simulator (RIBS) model is selected to simulate the hydrological process in the basin. Although the RIBS model is deterministic, it is run in a probabilistic way through the results of calibration developed in a previous work performed by the authors that identifies the probability distribution functions that best characterise the most relevant model parameters. Adaptive techniques improve the result of flood forecasts because the model can be adapted to observations in real time as new information is available. The new adaptive forecast model based on genetic programming as a data assimilation technique is compared with the previously developed flood forecast model based on the calibration results. Both models are probabilistic as they generate an ensemble of hydrographs, taking the different uncertainties inherent in any forecast process into account. The Manzanares River basin was selected as a case study, with the process being computationally intensive as it requires simulation of many replicas of the ensemble in real time.