947 results for Integrated Data Repository
Abstract:
Work carried out by Antonio Machado Carrillo, Juan Antonio Bermejo, and Ignacio Lorenzo
Abstract:
This dissertation established an integrated software-hardware design for a multisite data repository in pediatric epilepsy. A total of 16 institutions formed a consortium around this web-based application. This innovative, fully operational web application allows users to upload and retrieve information through a single human-computer graphical interface that is remotely accessible to all users of the consortium. A solution based on a Linux platform with MySQL and PHP (Personal Home Page) scripts was selected. Research was conducted to evaluate mechanisms for electronically transferring diverse datasets from the different hospitals and for collecting the clinical data together with the related functional magnetic resonance imaging (fMRI). What is unique in the approach is that all pertinent clinical information about patients is synthesized, with input from clinical experts, into four data entry forms: Clinical, fMRI scoring, Image information, and Neuropsychological. A first contribution of this dissertation is an integrated processing platform that is site and scanner independent, allowing the varied fMRI datasets to be processed uniformly and comparative brain activation patterns to be generated. Data collection from the consortium complied with IRB requirements and provides all the safeguards required for security and confidentiality. An fMRI-based software library was used to perform the data processing and statistical analysis that yield the brain activation maps. The Lateralization Index (LI) of healthy control (HC) subjects was evaluated in contrast to that of localization-related epilepsy (LRE) subjects. Over 110 activation maps were generated, and their respective LIs were computed, yielding the following groups: (a) strong right lateralization (HC=0%, LRE=18%), (b) right lateralization (HC=2%, LRE=10%), (c) bilateral (HC=20%, LRE=15%), (d) left lateralization (HC=42%, LRE=26%), (e) strong left lateralization (HC=36%, LRE=31%). Moreover, nonlinear multidimensional decision functions were used to seek an optimal separation between typical and atypical brain activations on the basis of demographics as well as the extent and intensity of these activations. The intent was not to seek the highest output measures, given the inherent overlap of the data, but rather to assess which of the many dimensions were critical to the overall assessment of typical and atypical language activations, with the freedom to select any number of dimensions and to impose any degree of complexity in the nonlinearity of the decision space.
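The abstract does not spell out how the Lateralization Index is computed or where the five lateralization groups are cut. A minimal sketch of the conventional definition, LI = (L - R) / (L + R) over suprathreshold voxel counts per hemisphere, is given below; the cutoff values (0.2 and 0.7) are illustrative assumptions, not values taken from the dissertation.

```python
def lateralization_index(left_voxels: int, right_voxels: int) -> float:
    """Conventional LI over suprathreshold voxel counts: +1 = fully left, -1 = fully right."""
    total = left_voxels + right_voxels
    if total == 0:
        raise ValueError("no suprathreshold voxels in either hemisphere")
    return (left_voxels - right_voxels) / total

def classify_li(li: float) -> str:
    """Map an LI value onto the five groups named in the abstract.
    The 0.2 / 0.7 cutoffs are illustrative assumptions only."""
    if li <= -0.7:
        return "strong right lateralization"
    if li <= -0.2:
        return "right lateralization"
    if li < 0.2:
        return "bilateral"
    if li < 0.7:
        return "left lateralization"
    return "strong left lateralization"

# LI = (820 - 310) / 1130 = 0.45 -> "left lateralization"
print(classify_li(lateralization_index(left_voxels=820, right_voxels=310)))
```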
Abstract:
Companies are increasingly dependent on distributed, web-based software systems to support their businesses, which increases the need to maintain and extend those systems with new, up-to-date features. The development process for introducing new features therefore needs to be swift and agile, and the supporting software evolution process needs to be safe, fast, and efficient. However, this is usually a difficult and challenging task for a developer because of the limited support offered by programming environments, frameworks, and database management systems. Changes to the code, to the database model, and to the actual data contained in the database must be planned and developed together and executed in a synchronized way. Even under a careful development discipline, the impact of changing an application's data model is hard to predict. Over an application's lifetime, changes and updates are designed and tested against data that is usually far from the real production data. Coding DDL and DML SQL scripts to update the database schema and data is therefore the usual (and hard) approach taken by developers. Such a manual approach is error prone and disconnected from the real data in production, because developers may not know the exact impact of their changes. This work aims to improve the maintenance process in the context of the Agile Platform by OutSystems. Our goal is to design and implement new data-model evolution features that ensure safe support for change and a sound migration process. Our solution includes impact analysis mechanisms targeting the data model and the data itself, providing developers with a safe, simple, and guided evolution process.
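To illustrate the kind of coupled schema-and-data change the abstract refers to, the sketch below splits a single name column into first and last name and backfills the data in the same transaction. It is a generic example using Python's sqlite3 module; the table and column names are hypothetical, and this is not how the Agile Platform itself performs migrations.

```python
import sqlite3

# Hypothetical migration: split customer.name into first_name / last_name.
# DDL and DML run in one transaction so schema and data stay synchronized.
conn = sqlite3.connect("app.db")
try:
    with conn:  # commits on success, rolls back on exception
        conn.execute("ALTER TABLE customer ADD COLUMN first_name TEXT")
        conn.execute("ALTER TABLE customer ADD COLUMN last_name TEXT")
        # Backfill the new columns from the existing data.
        rows = conn.execute("SELECT rowid, name FROM customer").fetchall()
        for rowid, name in rows:
            first, _, last = name.partition(" ")
            conn.execute(
                "UPDATE customer SET first_name = ?, last_name = ? WHERE rowid = ?",
                (first, last, rowid),
            )
finally:
    conn.close()
```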
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
The Social Computing Data Repository hosts data from a collection of many different social media sites, most of which have blogging capacity. Some of the prominent social media sites included in this repository are BlogCatalog, Twitter, MyBlogLog, Digg, StumbleUpon, del.icio.us, MySpace, LiveJournal, The Unofficial Apple Weblog (TUAW), and Reddit. The repository contains various facets of blog data, including blog site metadata such as user-defined tags, predefined categories, and blog site description; blog-post-level metadata such as user-defined tags and the date and time of posting; the blog posts themselves; blog post mood (defined as the blogger's emotions when he or she wrote the post); blogger name; blog post comments; and the blogger's social network.
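One way to picture the facets listed above is as a single blog-post record. The field names below are illustrative only and do not reflect the repository's actual schema or file formats.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class BlogPost:
    """Hypothetical record layout for the blog-post facets described above."""
    blogger_name: str
    posted_at: datetime                                   # date and time of posting
    text: str                                             # the blog post itself
    tags: list[str] = field(default_factory=list)         # user-defined tags
    categories: list[str] = field(default_factory=list)   # predefined categories
    mood: str | None = None                               # blogger's emotions when writing
    comments: list[str] = field(default_factory=list)     # blog post comments
    friends: list[str] = field(default_factory=list)      # blogger's social network

post = BlogPost("alice", datetime(2009, 5, 1, 14, 30), "First post!", tags=["intro"], mood="cheerful")
```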
Abstract:
Conventional seemingly unrelated estimation of the almost ideal demand system is shown to lead to small-sample bias and distortions in the size of a Wald test for symmetry and homogeneity when the data are cointegrated. A fully modified estimator is developed in an attempt to remedy these problems. It is shown that this estimator reduces the small-sample bias but fails to eliminate the size distortion. Bootstrapping is shown to be ineffective as a method of removing small-sample bias in both the conventional and the fully modified estimators. Bootstrapping is effective, however, as a method of removing size distortion, and it performs equally well in this respect with both estimators.
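For reference, the almost ideal demand system's budget-share equations and the restrictions examined by the Wald test take the standard Deaton-Muellbauer form shown below; this is the textbook specification, not notation taken from the abstract itself.

```latex
% Budget share of good i (expenditure x, prices p_j, price index P)
w_i = \alpha_i + \sum_j \gamma_{ij} \ln p_j + \beta_i \ln\!\left(\frac{x}{P}\right)

% Restrictions tested by the Wald statistic
\text{homogeneity: } \sum_j \gamma_{ij} = 0, \qquad
\text{symmetry: } \gamma_{ij} = \gamma_{ji}
```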
Abstract:
Absolute quantitation of clinical ¹H-MR spectra is virtually always incomplete for single subjects because separately determining the spectrum, the baseline, and the transverse and longitudinal relaxation times in a single subject would take prohibitively long. Integrated Processing and Acquisition of Data (IPAD), based on a combined two-dimensional experimental and fitting strategy, is suggested to substantially improve the information content obtained from a given measurement time. A series of localized saturation-recovery spectra was recorded and combined with two-dimensional prior-knowledge fitting to simultaneously determine metabolite T1 (from analysis of the saturation-recovery time course), metabolite T2 (from lineshape analysis based on metabolite and water peak shapes), the macromolecular baseline (based on T1 differences and analysis of the saturation-recovery time course), and metabolite concentrations (using prior-knowledge fitting and conventional procedures of absolute standardization). The procedure was tested on metabolite solutions and applied in 25 subjects (15-78 years old). Metabolite content was comparable to previously reported values. Interindividual variation was larger than intraindividual variation in repeated spectra for metabolite content as well as for some relaxation times. Relaxation times differed among the various metabolite groups. Part of the interindividual variation could be explained by a significant age dependence of the relaxation times.
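As a rough guide to how T1 enters the analysis, the saturation-recovery time course and the usual relaxation correction applied in absolute quantitation take the standard single-exponential forms below; the IPAD prior-knowledge model itself may be more detailed than this.

```latex
% Saturation-recovery time course used to estimate metabolite T1
% (recovery delay t, fully relaxed signal S_0)
S(t) = S_0 \left( 1 - e^{-t/T_1} \right)

% Relaxation correction in absolute quantitation (echo time TE, repetition time TR)
S_{\mathrm{obs}} = S_0 \, e^{-TE/T_2} \left( 1 - e^{-TR/T_1} \right)
```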
Abstract:
For a reliable simulation of the time- and space-dependent CO2 redistribution between ocean and atmosphere, an appropriate time-dependent simulation of particle dynamics processes is essential but had not been carried out so far. The major difficulties were the lack of suitable modules for particle dynamics and early diagenesis (needed to close the carbon and nutrient budgets) in ocean general circulation models, and the lack of understanding of biogeochemical processes such as the partial dissolution of calcareous particles in oversaturated water. The main target of ORFOIS was to fill this gap in our knowledge and prediction capability infrastructure. This goal was achieved step by step. First, comprehensive databases of existing observations relevant to the three major types of biogenic particles, namely organic carbon (POC), calcium carbonate (CaCO3), and biogenic silica (BSi, or opal), as well as to refractory particles of terrestrial origin, were collated and made publicly available.
Abstract:
The JGOFS International Collection Volume 2: Integrated Data Sets CD is a coherent, organised compilation of existing data sets produced by the member countries that participated in JGOFS. In most cases, the data were gathered from the JGOFS International Collection, Volume 1: Discrete Datasets DVD. To produce Vol. 1, data were taken from the original sources and copied "as is" onto the DVD. For Vol. 2, data and metadata were harmonized using the conversion software PanTool and the PANGAEA import routine, which checks metadata for completeness and defines the relations between data and metadata. Prior to import, the data underwent a technical quality control, i.e. checks of the format and readability of each file, the availability and combination of parameters and units, and the range of values.
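The sketch below mimics the three quality-control checks named above (file readability, parameter/unit combinations, value ranges) for a tab-delimited file. It is a generic illustration only; the column names, units, and ranges are assumptions, and this is not the actual PanTool or PANGAEA implementation.

```python
import csv

# Illustrative expectations; not taken from PanTool or PANGAEA.
EXPECTED_UNITS = {"Temperature": "degC", "Salinity": "PSU", "Depth": "m"}
VALUE_RANGES = {"Temperature": (-2.0, 40.0), "Salinity": (0.0, 42.0), "Depth": (0.0, 11000.0)}

def check_file(path: str) -> list[str]:
    """Return a list of quality-control problems found in a tab-delimited data file."""
    problems = []
    try:
        with open(path, newline="", encoding="utf-8") as fh:
            reader = csv.DictReader(fh, delimiter="\t")
            header = reader.fieldnames or []
            # Check availability and combination of parameters and units.
            for param, unit in EXPECTED_UNITS.items():
                if f"{param} [{unit}]" not in header:
                    problems.append(f"missing or mislabeled column: {param} [{unit}]")
            # Check the range of values.
            for lineno, row in enumerate(reader, start=2):
                for param, (lo, hi) in VALUE_RANGES.items():
                    raw = row.get(f"{param} [{EXPECTED_UNITS[param]}]")
                    if raw in (None, ""):
                        continue
                    value = float(raw)
                    if not lo <= value <= hi:
                        problems.append(f"line {lineno}: {param} = {value} outside [{lo}, {hi}]")
    except (OSError, UnicodeDecodeError, ValueError) as err:
        # Covers the format-and-readability check.
        problems.append(f"file not readable or malformed: {err}")
    return problems
```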
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-04
Abstract:
The MAP-i Doctoral Program of the Universities of Minho, Aveiro and Porto
Abstract:
Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
A data warehouse is a data repository that collects and maintains a large amount of data from multiple distributed, autonomous, and possibly heterogeneous data sources. Often the data is stored in the form of materialized views in order to provide fast access to the integrated data. One of the most important decisions in designing a data warehouse is the selection of views to materialize. The objective is to select an appropriate set of views that minimizes the total query response time under the constraint that the total maintenance time for these materialized views stays within a given bound. This view selection problem is totally different from the view selection problem under a disk space constraint. In this paper, the view selection problem under the maintenance time constraint is investigated, and two efficient heuristic algorithms for it are proposed. The key to devising the proposed algorithms is to define good heuristic functions and to reduce the problem to well-solved optimization problems, so that an approximate solution of the known optimization problem yields a feasible solution of the original problem. (C) 2001 Elsevier Science B.V. All rights reserved.
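To make the optimization concrete, the sketch below greedily picks views by query-time benefit per unit maintenance time until the maintenance-time bound is reached. It is a generic baseline under assumed inputs, not the paper's two heuristics, which reduce the problem to other well-solved optimization problems.

```python
from dataclasses import dataclass

@dataclass
class View:
    name: str
    query_benefit: float     # reduction in total query response time if materialized
    maintenance_time: float  # time needed to keep the materialized view up to date

def select_views(candidates: list[View], maintenance_bound: float) -> list[View]:
    """Greedy sketch: take views in order of benefit per unit maintenance time,
    skipping any view that would push total maintenance time past the bound."""
    chosen, used = [], 0.0
    for v in sorted(candidates, key=lambda v: v.query_benefit / v.maintenance_time, reverse=True):
        if used + v.maintenance_time <= maintenance_bound:
            chosen.append(v)
            used += v.maintenance_time
    return chosen

views = [View("sales_by_region", 120.0, 5.0), View("daily_revenue", 60.0, 1.0), View("full_join", 200.0, 50.0)]
print([v.name for v in select_views(views, maintenance_bound=10.0)])  # ['daily_revenue', 'sales_by_region']
```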