29 results for Data Repository
Abstract:
Research project based on a stay at the National Oceanography Centre, Southampton (NOCS), Great Britain, from May to July 2006. Obtaining an accurate estimate of sea surface salinity (SSS) is important for investigating and predicting the extent of climate change. The Soil Moisture and Ocean Salinity (SMOS) mission was selected by the European Space Agency (ESA) to produce global sea surface salinity maps with a short revisit time. Before the launch of SMOS, an analysis is planned of the horizontal variability of SSS and of the potential of data retrieved from SMOS measurements to reproduce known oceanographic behaviour. The overall aim is to fill the existing gap between reliable input/auxiliary data sources and the tools developed to simulate and process data acquired according to the SMOS configuration. The SMOS End-to-end Performance Simulator (SEPS) is an ad hoc simulator developed by the Universitat Politècnica de Catalunya (UPC) to generate data according to the SMOS configuration. Input data for SEPS were taken, at different spatial resolutions, from the Ocean Circulation and Climate Advanced Modeling (OCCAM) project, which is used at NOCS. By modifying SEPS to accept OCCAM data as input, simulated brightness temperature data were obtained for one month, with different ascending passes covering the selected area. The tasks carried out during the stay at NOCS were intended to provide a reliable technique for external calibration, and thus cancel the bias; to provide a methodology for temporally averaging the different acquisitions made during the ascending passes; and to determine the best configuration of the cost function before exploiting and investigating the potential of the SEPS/OCCAM data to derive the retrieved SSS with high-resolution patterns.
Abstract:
One challenge when running applications on a cluster is to improve performance while using resources efficiently, and this challenge is greater in a distributed environment. With this challenge in mind, a set of rules is proposed for performing the computation on each node, based on an analysis of the applications' computation and communications. A cell mapping scheme is analysed, together with a method for scheduling the order of execution by priority, in which boundary cells have higher priority than internal cells. The experiments show the overlap of internal computation with the communications of the boundary cells, with results in which speedup increases and efficiency remains above 85%. Finally, gains in execution time are obtained, leading to the conclusion that it is indeed possible to design an overlapping scheme that allows SPMD applications to run efficiently on a cluster.
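The priority rule described above (boundary cells before internal cells) can be illustrated with a minimal sketch. It assumes, purely for illustration, that each node owns a rectangular tile of cells and that a cell is a "boundary" cell when it lies on the edge of that tile; the function name `execution_order` is hypothetical and the abstract does not specify the actual mapping.

```python
def execution_order(rows, cols):
    """Return the cell coordinates of a rows x cols tile, boundary cells first.

    Assumption: a cell is a boundary cell when it lies on the edge of the
    tile owned by a node, so its result must be sent to a neighbouring node.
    """
    def is_boundary(r, c):
        return r in (0, rows - 1) or c in (0, cols - 1)

    cells = [(r, c) for r in range(rows) for c in range(cols)]
    # sorted() is stable, so cells keep row-major order within each class.
    return sorted(cells, key=lambda rc: 0 if is_boundary(*rc) else 1)


order = execution_order(4, 4)
boundary_first = order[:12]   # the 12 edge cells of a 4 x 4 tile
internal_last = order[12:]    # the 4 inner cells
```

Computing the boundary cells first means their values can be sent while the internal cells are still being computed; the actual overlap would additionally need non-blocking communication, which this sketch does not model.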
Abstract:
E-repositories are part of e-science and are built on the e-infrastructure. In 1999 the Centre de Supercomputació de Catalunya (CESCA), together with the Consorci de Biblioteques Universitàries de Catalunya (CBUC), started a cooperative repository named TDR to store, in digital format, the full text of the theses defended at the universities of our country, in order to disseminate them worldwide in open access while preserving the authors' copyright. Since then, four additional cooperative repositories have been created: RECERCAT for research papers; RACO for scientific, cultural and erudite Catalan magazines; MDC for Catalan digital collections of pictures, maps, posters and old magazines; and PADICAT for archiving Catalan digital web content. The main objective of the latter is to archive Catalan web sites; that is, PADICAT collects, processes and provides permanent access to the entire cultural, scientific and general output of Catalonia in digital format. The repository is managed by the Biblioteca de Catalunya, as the institution responsible for compiling, processing and distributing the bibliographic heritage of Catalonia, while CESCA is the technology partner. On September 11th, 2006 the repository went into operation for the general public, with some thirty websites archived. After a year and a half, it holds 2,720 captures of more than 1,000 websites, comprising 34 million files (HTML, images...) and two terabytes of data. The objective of this paper is to present PADICAT and our experience developing and managing it. We describe the repository briefly, explain the technology used to implement it and comment on our experience during its first year and a half.
Abstract:
The objective of this paper is to analyse to what extent the use of cross-section data distorts the estimated elasticities of car ownership demand when the observed variables do not correspond to a state equilibrium for some individuals in the sample. Our proposal consists of approximating the equilibrium values of the observed variables by constructing a pseudo-panel data set, which entails averaging individuals observed at different points in time into cohorts. The results show that individual and aggregate data lead to almost the same value for income elasticity, whereas with respect to working adult elasticity the similarity is less pronounced.
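As a rough illustration of the pseudo-panel construction mentioned above, the sketch below averages individual records into cohort cells. The cohort definition (ten-year birth bands) and the field names `birth_year`, `survey_year`, `income` and `cars` are assumptions made for the example, not details taken from the paper.

```python
from collections import defaultdict
from statistics import fmean


def pseudo_panel(records, band=10):
    """Average individuals into (cohort, survey_year) cells.

    Assumption: a cohort is a band of birth years of width `band`.
    """
    groups = defaultdict(list)
    for rec in records:
        cohort = rec["birth_year"] // band * band
        groups[(cohort, rec["survey_year"])].append(rec)

    return {
        key: {
            "income": fmean([r["income"] for r in rows]),
            "cars": fmean([r["cars"] for r in rows]),
            "n": len(rows),
        }
        for key, rows in sorted(groups.items())
    }


# Hypothetical individual-level observations from one cross-section.
sample = [
    {"birth_year": 1961, "survey_year": 1990, "income": 10, "cars": 1},
    {"birth_year": 1968, "survey_year": 1990, "income": 20, "cars": 2},
    {"birth_year": 1975, "survey_year": 1990, "income": 30, "cars": 0},
]
panel = pseudo_panel(sample)
```

Each cell of `panel` can then be treated as one observation of a cohort followed over successive surveys, which is what allows panel-style estimation on repeated cross-sections.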
Abstract:
Report for the scientific sojourn carried out at the University of New South Wales from February to June 2007. Two different biogeochemical models are coupled to a three-dimensional configuration of the Princeton Ocean Model (POM) for the Northwestern Mediterranean Sea (Ahumada and Cruzado, 2007). The first biogeochemical model (BLANES) is the three-dimensional version of the model described by Bahamon and Cruzado (2003) and computes the nitrogen fluxes through six compartments using semi-empirical descriptions of biological processes. The second biogeochemical model (BIOMEC) is the biomechanical NPZD model described in Baird et al. (2004), which uses a combination of physiological and physical descriptions to quantify the rates of planktonic interactions. Physical descriptions include, for example, the diffusion of nutrients to phytoplankton cells and the encounter rate of predators and prey. The link between physical and biogeochemical processes in both models is expressed by the advection-diffusion of the non-conservative tracers. The similarities in the mathematical formulation of the biogeochemical processes in the two models are exploited to determine the parameter set for the biomechanical model that best fits the parameter set used in the first model. Three years of integration have been carried out for each model to reach the so-called perpetual year run for biogeochemical conditions. Outputs from both models are averaged monthly and then compared with remote sensing chlorophyll images obtained from the MERIS sensor.
Abstract:
This paper develops a methodology to estimate entire population distributions from bin-aggregated sample data. We do this by estimating the parameters of mixtures of distributions that allow for maximal parametric flexibility. The statistical approach we develop enables comparisons of the full distributions of height data from potential army conscripts across France's 88 departments for most of the nineteenth century. These comparisons are made by testing for differences-of-means stochastic dominance. Corrections for possible measurement errors are also devised by taking advantage of the richness of the data sets. Our methodology is of interest to researchers working on historical as well as contemporary bin-aggregated or histogram-type data, which remain widely used because much of the publicly available information comes in that form, often owing to restrictions motivated by political sensitivity and/or confidentiality concerns.
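To make the idea of fitting a mixture to bin-aggregated data concrete, here is a minimal sketch of a binned log-likelihood. It assumes a mixture of normal components and a multinomial likelihood over the bins; the paper's actual mixture family and estimator are not given in the abstract, and the bin edges and counts below are invented for illustration.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of Normal(mu, sigma)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def mixture_cdf(x, components):
    """components: list of (weight, mu, sigma), weights summing to 1."""
    return sum(w * normal_cdf(x, mu, s) for w, mu, s in components)


def binned_log_likelihood(edges, counts, components):
    """Multinomial log-likelihood of the bin counts, up to a constant."""
    total = 0.0
    for lo, hi, n in zip(edges[:-1], edges[1:], counts):
        p = mixture_cdf(hi, components) - mixture_cdf(lo, components)
        total += n * math.log(max(p, 1e-300))  # guard against log(0)
    return total


# Hypothetical height bins in cm; the outer bins are open-ended.
edges = [-math.inf, 155.0, 160.0, 165.0, 170.0, math.inf]
counts = [5, 20, 50, 20, 5]

ll_centred = binned_log_likelihood(edges, counts, [(1.0, 162.5, 5.0)])
ll_shifted = binned_log_likelihood(edges, counts, [(1.0, 170.0, 5.0)])
```

Estimation would then consist of maximising `binned_log_likelihood` over the component weights, means and standard deviations with a numerical optimiser, a step deliberately left out of this sketch.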
Abstract:
In this paper we analyze the persistence of aggregate real exchange rates (RERs) for a group of EU-15 countries by using sectoral data. The tight relation between aggregate and sectoral persistence recently investigated by Mayoral (2008) allows us to decompose aggregate RER persistence into the persistence of its different subcomponents. We show that the distribution of sectoral persistence is highly heterogeneous and very skewed to the right, and that a limited number of sectors are responsible for the high levels of persistence observed at the aggregate level. We use quantile regression to investigate whether the traditional theories proposed to account for the slow reversion to parity (lack of arbitrage due to nontradabilities or imperfect competition and price stickiness) are able to explain the behavior of the upper quantiles of sectoral persistence. We conclude that pricing to market in the intermediate goods sector together with price stickiness have more explanatory power than variables related to the tradability of the goods or their inputs.
Abstract:
Consider a model with parameter phi, and an auxiliary model with parameter theta. Let phi be randomly sampled from a given density over the known parameter space. Monte Carlo methods can be used to draw simulated data and compute the corresponding estimate of theta, say theta_tilde. A large set of tuples (phi, theta_tilde) can be generated in this manner. Nonparametric methods may be used to fit the function E(phi|theta_tilde=a), using these tuples. It is proposed to estimate phi using the fitted E(phi|theta_tilde=theta_hat), where theta_hat is the auxiliary estimate, using the real sample data. This is a consistent and asymptotically normally distributed estimator, under certain assumptions. Monte Carlo results for dynamic panel data and vector autoregressions show that this estimator can have very attractive small sample properties. Confidence intervals can be constructed using the quantiles of the phi for which theta_tilde is close to theta_hat. Such confidence intervals are found to have very accurate coverage.
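The procedure described above can be sketched on a toy problem. Everything specific here is an assumption chosen for illustration: the model is y ~ Normal(phi, 1), the auxiliary statistic is the sample mean, phi is drawn from a uniform density, and the nonparametric fit is a plain k-nearest-neighbour average rather than whatever smoother the paper uses.

```python
import random
import statistics


def simulate(phi, n, rng):
    """Draw n observations from the toy model y ~ Normal(phi, 1)."""
    return [rng.gauss(phi, 1.0) for _ in range(n)]


def auxiliary_estimate(sample):
    """Auxiliary estimate of theta: here simply the sample mean."""
    return statistics.fmean(sample)


def build_tuples(n_draws, n_obs, lo, hi, rng):
    """Generate (phi, theta_tilde) tuples with phi ~ Uniform(lo, hi)."""
    tuples = []
    for _ in range(n_draws):
        phi = rng.uniform(lo, hi)
        tuples.append((phi, auxiliary_estimate(simulate(phi, n_obs, rng))))
    return tuples


def nearest(tuples, theta_hat, k):
    """The k tuples whose theta_tilde is closest to theta_hat."""
    return sorted(tuples, key=lambda t: abs(t[1] - theta_hat))[:k]


def estimate_phi(tuples, theta_hat, k=50):
    """k-NN estimate of E(phi | theta_tilde = theta_hat)."""
    return statistics.fmean([phi for phi, _ in nearest(tuples, theta_hat, k)])


def confidence_interval(tuples, theta_hat, k=200, alpha=0.10):
    """Quantiles of the phi values whose theta_tilde is close to theta_hat."""
    phis = sorted(phi for phi, _ in nearest(tuples, theta_hat, k))
    last = len(phis) - 1
    return phis[int(alpha / 2 * last)], phis[int((1 - alpha / 2) * last)]


rng = random.Random(0)
tuples = build_tuples(n_draws=5000, n_obs=30, lo=-2.0, hi=2.0, rng=rng)
real_sample = simulate(0.5, 30, rng)   # stands in for the real data
theta_hat = auxiliary_estimate(real_sample)
phi_hat = estimate_phi(tuples, theta_hat)
ci_low, ci_high = confidence_interval(tuples, theta_hat)
```

In this toy case the auxiliary statistic already estimates phi directly, so `phi_hat` should land close to `theta_hat`; the approach is more interesting when the auxiliary model is a misspecified but easy-to-estimate approximation of the model of interest.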
Abstract:
In this paper we describe an open learning object repository on Statistics based on DSpace which contains true learning objects, that is, exercises, equations, data sets, etc. This repository is part of a large project intended to promote the use of learning object repositories as part of the learning process in virtual learning environments. This involves the creation of a new user interface that provides users with additional services such as resource rating, commenting and so on. Both aspects make traditional metadata schemes such as Dublin Core inadequate: some resources have no title or author, for instance, because learners do not use those fields to browse and search for learning resources in the repository. Therefore, exporting OAI-PMH compliant records using OAI-DC is not possible, which limits the visibility of the learning objects in the repository outside the institution. We propose an architecture based on ontologies and the use of extended metadata records for both storing and refactoring such descriptions.
Abstract:
In this paper we construct a data set on EU cohesion aid to Spain during the planning period 2000-06. The data are disaggregated by region, year and function and attempt to approximate the timing of actual executed expenditure on assisted projects.
Abstract:
An increasing number of studies have sprung up in recent years seeking to identify individual inventors from patent data. Different heuristics have been suggested to use their names and other information disclosed in patent documents in order to find out “who is who” in patents. This paper contributes to this literature by setting forth a methodology to identify them using patent applications filed at the European Patent Office (EPO hereafter). As in much of this literature, we basically follow a three-step procedure: (1) the parsing stage, aimed at reducing the noise in the inventor’s name and other fields of the patent; (2) the matching stage, where name matching algorithms are used to group possibly similar names; (3) the filtering stage, where additional information and different scoring schemes are used to filter these candidate matches. The paper includes some figures resulting from applying the algorithms to the set of European inventors applying to the EPO over a long period of time.
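The three stages listed above (parsing, matching, filtering) can be sketched as follows. This is only an illustration under stated assumptions: the record fields `id`, `name`, `city` and `ipc`, the similarity threshold, and the scoring rule are all hypothetical, and `difflib.SequenceMatcher` stands in for whichever name-matching algorithm the paper actually uses.

```python
import re
import unicodedata
from difflib import SequenceMatcher
from itertools import combinations


def parse_name(raw):
    """Parsing stage: strip accents, punctuation, case and token order."""
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z ]", " ", text.lower())
    return " ".join(sorted(text.split()))


def match_candidates(records, threshold=0.85):
    """Matching stage: pair records whose cleaned names are similar enough."""
    pairs = []
    for a, b in combinations(records, 2):
        ratio = SequenceMatcher(
            None, parse_name(a["name"]), parse_name(b["name"])
        ).ratio()
        if ratio >= threshold:
            pairs.append((a, b))
    return pairs


def filter_pairs(pairs, min_score=1):
    """Filtering stage: keep pairs backed by additional shared information."""
    kept = []
    for a, b in pairs:
        # Hypothetical score: one point per shared attribute.
        score = (a["city"] == b["city"]) + (a["ipc"] == b["ipc"])
        if score >= min_score:
            kept.append((a["id"], b["id"]))
    return kept


# Invented records, for illustration only.
records = [
    {"id": 1, "name": "García, Joan", "city": "Barcelona", "ipc": "H04L"},
    {"id": 2, "name": "Joan Garcia", "city": "Barcelona", "ipc": "G06F"},
    {"id": 3, "name": "Joan Garcia", "city": "Munich", "ipc": "A61K"},
    {"id": 4, "name": "Maria Soler", "city": "Barcelona", "ipc": "H04L"},
]
same_inventor = filter_pairs(match_candidates(records))
```

Here records 1, 2 and 3 all survive the matching stage because their cleaned names coincide, but only the pair sharing a city passes the filter. The all-pairs comparison is quadratic in the number of records, so a real data set would need some form of blocking before matching.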
Abstract:
In this work we discuss the use of the standard model for the calculation of the solvency capital requirement (SCR) when the company aims to use the specific parameters of the model on the basis of the experience of its portfolio. In particular, this analysis focuses on the formula presented in the latest quantitative impact study (2010 CEIOPS) for non-life underwriting premium and reserve risk. One of the keys of the standard model for premium and reserve risk is the correlation matrix between lines of business. In this work we present how the correlation matrix between lines of business could be estimated from a quantitative perspective, as well as the possibility of using a credibility model for the estimation of the correlation matrix between lines of business that merges the qualitative and quantitative perspectives.
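The role of the correlation matrix can be illustrated with a small sketch. It assumes the usual quadratic-form aggregation of per-line standard deviations weighted by volume measures, and it models the credibility idea as a simple element-wise blend of a company-estimated matrix with a standard one. Both function names, the blend itself and all numbers are assumptions for illustration, not the paper's specification.

```python
import math


def combined_sigma(volumes, sigmas, corr):
    """Overall standard deviation across lines of business.

    Quadratic form: sqrt(sum_ij corr_ij * s_i * V_i * s_j * V_j) / sum(V).
    """
    n = len(volumes)
    quad = sum(
        corr[i][j] * sigmas[i] * volumes[i] * sigmas[j] * volumes[j]
        for i in range(n)
        for j in range(n)
    )
    return math.sqrt(quad) / sum(volumes)


def credibility_blend(corr_own, corr_standard, z):
    """Element-wise blend z * own + (1 - z) * standard, with 0 <= z <= 1.

    Assumption: the credibility factor z is a single scalar.
    """
    return [
        [z * own + (1.0 - z) * std for own, std in zip(row_own, row_std)]
        for row_own, row_std in zip(corr_own, corr_standard)
    ]


# Hypothetical two-line portfolio.
volumes = [100.0, 100.0]
sigmas = [0.10, 0.10]
corr_standard = [[1.0, 0.5], [0.5, 1.0]]   # qualitative (standard) matrix
corr_own = [[1.0, 0.2], [0.2, 1.0]]        # estimated from own experience

corr_mixed = credibility_blend(corr_own, corr_standard, z=0.4)
sigma_standard = combined_sigma(volumes, sigmas, corr_standard)
sigma_mixed = combined_sigma(volumes, sigmas, corr_mixed)
```

A convex combination of two valid correlation matrices is itself a valid correlation matrix, which is one reason an element-wise blend is a convenient starting point; a lower blended correlation yields a lower combined standard deviation.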
Abstract:
We use historical data that cover more than one century of real GDP for industrial countries and employ the Pesaran panel unit root test, which allows for cross-sectional dependence, to test for a unit root in real GDP. We find strong evidence against the unit root null. Our results are robust to the chosen group of countries and the sample period. Key words: real GDP stationarity, cross-sectional dependence, CIPS test. JEL Classification: C23, E32
Abstract:
The aim of this document is to understand the problem of object persistence, to find and study the different existing solutions, and to study one of them in particular: the JDO persistence layer.