926 resultados para Model-Data Integration and Data Assimilation
Resumo:
P>In livestock genetic resource conservation, decision making about conservation priorities is based on the simultaneous analysis of several different criteria that may contribute to long-term sustainable breeding conditions, such as genetic and demographic characteristics, environmental conditions, and role of the breed in the local or regional economy. Here we address methods to integrate different data sets and highlight problems related to interdisciplinary comparisons. Data integration is based on the use of geographic coordinates and Geographic Information Systems (GIS). In addition to technical problems related to projection systems, GIS have to face the challenging issue of the non homogeneous scale of their data sets. We give examples of the successful use of GIS for data integration and examine the risk of obtaining biased results when integrating datasets that have been captured at different scales.
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Resumo:
In the last few years the resolution of numerical weather prediction (nwp) became higher and higher with the progresses of technology and knowledge. As a consequence, a great number of initial data became fundamental for a correct initialization of the models. The potential of radar observations has long been recognized for improving the initial conditions of high-resolution nwp models, while operational application becomes more frequent. The fact that many nwp centres have recently taken into operations convection-permitting forecast models, many of which assimilate radar data, emphasizes the need for an approach to providing quality information which is needed in order to avoid that radar errors degrade the model's initial conditions and, therefore, its forecasts. Environmental risks can can be related with various causes: meteorological, seismical, hydrological/hydraulic. Flash floods have horizontal dimension of 1-20 Km and can be inserted in mesoscale gamma subscale, this scale can be modeled only with nwp model with the highest resolution as the COSMO-2 model. One of the problems of modeling extreme convective events is related with the atmospheric initial conditions, in fact the scale dimension for the assimilation of atmospheric condition in an high resolution model is about 10 Km, a value too high for a correct representation of convection initial conditions. Assimilation of radar data with his resolution of about of Km every 5 or 10 minutes can be a solution for this problem. In this contribution a pragmatic and empirical approach to deriving a radar data quality description is proposed to be used in radar data assimilation and more specifically for the latent heat nudging (lhn) scheme. Later the the nvective capabilities of the cosmo-2 model are investigated through some case studies. Finally, this work shows some preliminary experiments of coupling of a high resolution meteorological model with an Hydrological one.
Resumo:
Several countries have acquired, over the past decades, large amounts of area covering Airborne Electromagnetic data. Contribution of airborne geophysics has dramatically increased for both groundwater resource mapping and management proving how those systems are appropriate for large-scale and efficient groundwater surveying. We start with processing and inversion of two AEM dataset from two different systems collected over the Spiritwood Valley Aquifer area, Manitoba, Canada respectively, the AeroTEM III (commissioned by the Geological Survey of Canada in 2010) and the “Full waveform VTEM” dataset, collected and tested over the same survey area, during the fall 2011. We demonstrate that in the presence of multiple datasets, either AEM and ground data, due processing, inversion, post-processing, data integration and data calibration is the proper approach capable of providing reliable and consistent resistivity models. Our approach can be of interest to many end users, ranging from Geological Surveys, Universities to Private Companies, which are often proprietary of large geophysical databases to be interpreted for geological and\or hydrogeological purposes. In this study we deeply investigate the role of integration of several complimentary types of geophysical data collected over the same survey area. We show that data integration can improve inversions, reduce ambiguity and deliver high resolution results. We further attempt to use the final, most reliable output resistivity models as a solid basis for building a knowledge-driven 3D geological voxel-based model. A voxel approach allows a quantitative understanding of the hydrogeological setting of the area, and it can be further used to estimate the aquifers volumes (i.e. potential amount of groundwater resources) as well as hydrogeological flow model prediction. In addition, we investigated the impact of an AEM dataset towards hydrogeological mapping and 3D hydrogeological modeling, comparing it to having only a ground based TEM dataset and\or to having only boreholes data.
Resumo:
Intensity modulated radiation therapy (IMRT) is a technique that delivers a highly conformal dose distribution to a target volume while attempting to maximally spare the surrounding normal tissues. IMRT is a common treatment modality used for treating head and neck (H&N) cancers, and the presence of many critical structures in this region requires accurate treatment delivery. The Radiological Physics Center (RPC) acts as both a remote and on-site quality assurance agency that credentials institutions participating in clinical trials. To date, about 30% of all IMRT participants have failed the RPC’s remote audit using the IMRT H&N phantom. The purpose of this project is to evaluate possible causes of H&N IMRT delivery errors observed by the RPC, specifically IMRT treatment plan complexity and the use of improper dosimetry data from machines that were thought to be matched but in reality were not. Eight H&N IMRT plans with a range of complexity defined by total MU (1460-3466), number of segments (54-225), and modulation complexity scores (MCS) (0.181-0.609) were created in Pinnacle v.8m. These plans were delivered to the RPC’s H&N phantom on a single Varian Clinac. One of the IMRT plans (1851 MU, 88 segments, and MCS=0.469) was equivalent to the median H&N plan from 130 previous RPC H&N phantom irradiations. This average IMRT plan was also delivered on four matched Varian Clinac machines and the dose distribution calculated using a different 6MV beam model. Radiochromic film and TLD within the phantom were used to analyze the dose profiles and absolute doses, respectively. The measured and calculated were compared to evaluate the dosimetric accuracy. All deliveries met the RPC acceptance criteria of ±7% absolute dose difference and 4 mm distance-to-agreement (DTA). Additionally, gamma index analysis was performed for all deliveries using a ±7%/4mm and ±5%/3mm criteria. Increasing the treatment plan complexity by varying the MU, number of segments, or varying the MCS resulted in no clear trend toward an increase in dosimetric error determined by the absolute dose difference, DTA, or gamma index. Varying the delivery machines as well as the beam model (use of a Clinac 6EX 6MV beam model vs. Clinac 21EX 6MV model), also did not show any clear trend towards an increased dosimetric error using the same criteria indicated above.
Resumo:
The joint modeling of longitudinal and survival data is a new approach to many applications such as HIV, cancer vaccine trials and quality of life studies. There are recent developments of the methodologies with respect to each of the components of the joint model as well as statistical processes that link them together. Among these, second order polynomial random effect models and linear mixed effects models are the most commonly used for the longitudinal trajectory function. In this study, we first relax the parametric constraints for polynomial random effect models by using Dirichlet process priors, then three longitudinal markers rather than only one marker are considered in one joint model. Second, we use a linear mixed effect model for the longitudinal process in a joint model analyzing the three markers. In this research these methods were applied to the Primary Biliary Cirrhosis sequential data, which were collected from a clinical trial of primary biliary cirrhosis (PBC) of the liver. This trial was conducted between 1974 and 1984 at the Mayo Clinic. The effects of three longitudinal markers (1) Total Serum Bilirubin, (2) Serum Albumin and (3) Serum Glutamic-Oxaloacetic transaminase (SGOT) on patients' survival were investigated. Proportion of treatment effect will also be studied using the proposed joint modeling approaches. ^ Based on the results, we conclude that the proposed modeling approaches yield better fit to the data and give less biased parameter estimates for these trajectory functions than previous methods. Model fit is also improved after considering three longitudinal markers instead of one marker only. The results from analysis of proportion of treatment effects from these joint models indicate same conclusion as that from the final model of Fleming and Harrington (1991), which is Bilirubin and Albumin together has stronger impact in predicting patients' survival and as a surrogate endpoints for treatment. ^
Resumo:
Objective. To determine the association between nativity status and mammography utilization among women in the U.S. and assess whether demographic variables, socioeconomic factors healthcare access, breast cancer risk factors and acculturation variables were predictors in the relationship between nativity status and mammography in the past two years. ^ Methods. The NHIS collects demographic and health information using face-to-face interviews among a representative sample of the U.S. population and a cancer control module assessing screening behaviors is included every five years. Descriptive statistics were used to report demographic characteristics of women aged 40 and older who have received a mammogram in the last 2 years from 2000 and 2005. We used chi square analyses to determine statistically significant differences by mammography screening for each covariate. Logistic regression was used to determine whether demographic characteristics, socioeconomic characteristics, healthcare access, breast cancer risk factors and acculturation variables among foreign-born Hispanics affected the relationship between nativity status and mammography use in the past 2 years. ^ Results. In 2000, the crude model between nativity and mammography was significant but results were not significant after adjusting for health insurance, access and reported health status. Significant results were also reported for years in U.S. and mammography among foreign-born born women. In 2005, the crude model was also significant but results were not significant after adjusting for demographic factors. Furthermore, there was a significant finding between citizenship and mammography in the past 2 years. ^ Conclusions. Our study contributes to the literature as one of the first national-based studies assessing mammography in the past two years based on nativity status. Based on our findings, health insurance and access to care is an important predictor in mammography utilization among foreign-born women. For those with health care access, physician recommendation should further be assessed to determine whether women are made aware of mammography as a means to detect breast cancer at an early stage and further reduce the risk of mortality from the breast cancer.^
Resumo:
The copepod Calanus finmarchicus is the dominant species of the meso-zooplankton in the Norwegian Sea, and constitutes an important link between the phytoplankton and the higher trophic levels in the Norwegian Sea food chain. An individualbased model for C. finmarchicus, based on super-individuals and evolving traits for behaviour, stages, etc., is two-way coupled to the NORWegian ECOlogical Model system (NORWECOM). One year of modelled C. finmarchicus spatial distribution, production and biomass are found to represent observations reasonably well. High C. finmarchicus abundance is found along the Norwegian shelf-break in the early summer, while the overwintering population is found along the slope and in the deeper Norwegian Sea basins. The timing of the spring bloom is generally later than in the observations. Annual Norwegian Sea production is found to be 29 million tonnes of carbon and a production to biomass (P/B) ratio of 4.3 emerges. Sensitivity tests show that the modelling system is robust to initial values of behavioural traits and with regards to the number of super-individuals simulated given that this is above about 50,000 individuals. Experiments with the model system indicate that it provides a valuable tool for studies of ecosystem responses to causative forces such as prey density or overwintering population size. For example, introducing C. finmarchicus food limitations reduces the stock dramatically, but on the other hand, a reduced stock may rebuild in one year under normal conditions. The NetCDF file contains model grid coordinates and bottom topography.
Resumo:
The Southern Westerly Winds (SWW) exert a crucial influence over the world ocean and climate. Nevertheless, a comprehensive understanding of the Holocene temporal and spatial evolution of the SWW remains a significant challenge due to the sparsity of high-resolution marine archives and appropriate SWW proxies. Here, we present a north-south transect of high-resolution planktonic foraminiferal oxygen isotope records from the western South Atlantic. Our proxy records reveal Holocene migrations of the Brazil- Malvinas Confluence (BMC), a highly sensitive feature for changes in the position and strength of the northern portion of the SWW. Through the tight coupling of the BMC position to the large-scale wind field, the records allow a quantitative reconstruction of Holocene latitudinal displacements of the SWW across the South Atlantic. Our data reveal a gradual poleward movement of the SWW by about 1-1.5° from the early to the mid-Holocene. Afterwards variability in the SWW is dominated by millennial-scale displacements in the order of 1° in latitude with no recognizable longer-term trend. These findings are confronted with results from a state-of-the-art transient Holocene climate simulation using a comprehensive coupled atmosphere-ocean general circulation model. Proxy-inferred and modeled SWW shifts compare qualitatively, but the model underestimates both orbitally forced multi-millennial and internal millennial SWW variability by almost an order of magnitude. The underestimated natural variability implies a substantial uncertainty in model projections of future SWW shifts.
Resumo:
OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web
1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS
Computational Linguistics is already a consolidated research area. It builds upon the results of other two major ones, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly, its most well-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs.
These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.
Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.
However, linguistic annotation tools have still some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.
A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitations stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.
In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own ones. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by other higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.
Therefore, it would be quite useful to find a way to
(i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools;
(ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. There again, ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should help solve also the problems and limitations of linguistic annotation tools aforementioned.
Thus, to summarise, the main aim of the present work was to combine somehow these separated approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a sort of hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.
2. GOALS OF THE PRESENT WORK
As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based
Resumo:
La mayoría de las aplicaciones forestales del escaneo laser aerotransportado (ALS, del inglés airborne laser scanning) requieren la integración y uso simultaneo de diversas fuentes de datos, con el propósito de conseguir diversos objetivos. Los proyectos basados en sensores remotos normalmente consisten en aumentar la escala de estudio progresivamente a lo largo de varias fases de fusión de datos: desde la información más detallada obtenida sobre un área limitada (la parcela de campo), hasta una respuesta general de la cubierta forestal detectada a distancia de forma más incierta pero cubriendo un área mucho más amplia (la extensión cubierta por el vuelo o el satélite). Todas las fuentes de datos necesitan en ultimo termino basarse en las tecnologías de sistemas de navegación global por satélite (GNSS, del inglés global navigation satellite systems), las cuales son especialmente erróneas al operar por debajo del dosel forestal. Otras etapas adicionales de procesamiento, como la ortorectificación, también pueden verse afectadas por la presencia de vegetación, deteriorando la exactitud de las coordenadas de referencia de las imágenes ópticas. Todos estos errores introducen ruido en los modelos, ya que los predictores se desplazan de la posición real donde se sitúa su variable respuesta. El grado por el que las estimaciones forestales se ven afectadas depende de la dispersión espacial de las variables involucradas, y también de la escala utilizada en cada caso. Esta tesis revisa las fuentes de error posicional que pueden afectar a los diversos datos de entrada involucrados en un proyecto de inventario forestal basado en teledetección ALS, y como las propiedades del dosel forestal en sí afecta a su magnitud, aconsejando en consecuencia métodos para su reducción. También se incluye una discusión sobre las formas más apropiadas de medir exactitud y precisión en cada caso, y como los errores de posicionamiento de hecho afectan a la calidad de las estimaciones, con vistas a una planificación eficiente de la adquisición de los datos. La optimización final en el posicionamiento GNSS y de la radiometría del sensor óptico permitió detectar la importancia de este ultimo en la predicción de la desidad relativa de un bosque monoespecífico de Pinus sylvestris L. ABSTRACT Most forestry applications of airborne laser scanning (ALS) require the integration and simultaneous use of various data sources, pursuing a variety of different objectives. Projects based on remotely-sensed data generally consist in upscaling data fusion stages: from the most detailed information obtained for a limited area (field plot) to a more uncertain forest response sensed over a larger extent (airborne and satellite swath). All data sources ultimately rely on global navigation satellite systems (GNSS), which are especially error-prone when operating under forest canopies. Other additional processing stages, such as orthorectification, may as well be affected by vegetation, hence deteriorating the accuracy of optical imagery’s reference coordinates. These errors introduce noise to the models, as predictors displace from their corresponding response. The degree to which forest estimations are affected depends on the spatial dispersion of the variables involved and the scale used. This thesis reviews the sources of positioning errors which may affect the different inputs involved in an ALS-assisted forest inventory project, and how the properties of the forest canopy itself affects their magnitude, advising on methods for diminishing them. It is also discussed how accuracy should be assessed, and how positioning errors actually affect forest estimation, toward a cost-efficient planning for data acquisition. The final optimization in positioning the GNSS and optical image allowed to detect the importance of the latter in predicting relative density in a monospecific Pinus sylvestris L. forest.
Resumo:
Recently, experts and practitioners in language resources have started recognizing the benefits of the linked data (LD) paradigm for the representation and exploitation of linguistic data on the Web. The adoption of the LD principles is leading to an emerging ecosystem of multilingual open resources that conform to the Linguistic Linked Open Data Cloud, in which datasets of linguistic data are interconnected and represented following common vocabularies, which facilitates linguistic information discovery, integration and access. In order to contribute to this initiative, this paper summarizes several key aspects of the representation of linguistic information as linked data from a practical perspective. The main goal of this document is to provide the basic ideas and tools for migrating language resources (lexicons, corpora, etc.) as LD on the Web and to develop some useful NLP tasks with them (e.g., word sense disambiguation). Such material was the basis of a tutorial imparted at the EKAW’14 conference, which is also reported in the paper.
Resumo:
We have developed a statistical gap-filling method adapted to the specific coverage and properties of observed fugacity of surface ocean CO2 (fCO2). We have used this method to interpolate the Surface Ocean CO2 Atlas (SOCAT) v2 database on a 2.5°×2.5° global grid (south of 70°N) for 1985-2011 at monthly resolution. The method combines a spatial interpolation based on a 'radius of influence' to determine nearby similar fCO2 values with temporal harmonic and cubic spline curve-fitting, and also fits long term trends and seasonal cycles. Interannual variability is established using deviations of observations from the fitted trends and seasonal cycles. An uncertainty is computed for all interpolated values based on the spatial and temporal range of the interpolation. Tests of the method using model data show that it performs as well as or better than previous regional interpolation methods, but in addition it provides a near-global and interannual coverage.
Resumo:
Foley [J. Opt. Soc. Am. A 11 (1994) 1710] has proposed an influential psychophysical model of masking in which mask components in a contrast gain pool are raised to an exponent before summation and divisive inhibition. We tested this summation rule in experiments in which contrast detection thresholds were measured for a vertical 1 c/deg (or 2 c/deg) sine-wave component in the presence of a 3 c/deg (or 6 c/deg) mask that had either a single component oriented at -45° or a pair of components oriented at ±45°. Contrary to the predictions of Foley's model 3, we found that for masks of moderate contrast and above, threshold elevation was predicted by linear summation of the mask components in the inhibitory stage of the contrast gain pool. We built this feature into two new models, referred to as the early adaptation model and the hybrid model. In the early adaptation model, contrast adaptation controls a threshold-like nonlinearity on the output of otherwise linear pathways that provide the excitatory and inhibitory inputs to a gain control stage. The hybrid model involves nonlinear and nonadaptable routes to excitatory and inhibitory stages as well as an adaptable linear route. With only six free parameters, both models provide excellent fits to the masking and adaptation data of Foley and Chen [Vision Res. 37 (1997) 2779] but unlike Foley and Chen's model, are able to do so with only one adaptation parameter. However, only the hybrid model is able to capture the features of Foley's (1994) pedestal plus orthogonal fixed mask data. We conclude that (1) linear summation of inhibitory components is a feature of contrast masking, and (2) that the main aftereffect of spatial adaptation on contrast increment thresholds can be assigned to a single site. © 2002 Elsevier Science Ltd. All rights reserved.