933 resultados para database integration


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Des progrès significatifs ont été réalisés dans le domaine de l'intégration quantitative des données géophysique et hydrologique l'échelle locale. Cependant, l'extension à de plus grandes échelles des approches correspondantes constitue encore un défi majeur. Il est néanmoins extrêmement important de relever ce défi pour développer des modèles fiables de flux des eaux souterraines et de transport de contaminant. Pour résoudre ce problème, j'ai développé une technique d'intégration des données hydrogéophysiques basée sur une procédure bayésienne de simulation séquentielle en deux étapes. Cette procédure vise des problèmes à plus grande échelle. L'objectif est de simuler la distribution d'un paramètre hydraulique cible à partir, d'une part, de mesures d'un paramètre géophysique pertinent qui couvrent l'espace de manière exhaustive, mais avec une faible résolution (spatiale) et, d'autre part, de mesures locales de très haute résolution des mêmes paramètres géophysique et hydraulique. Pour cela, mon algorithme lie dans un premier temps les données géophysiques de faible et de haute résolution à travers une procédure de réduction déchelle. Les données géophysiques régionales réduites sont ensuite reliées au champ du paramètre hydraulique à haute résolution. J'illustre d'abord l'application de cette nouvelle approche dintégration des données à une base de données synthétiques réaliste. Celle-ci est constituée de mesures de conductivité hydraulique et électrique de haute résolution réalisées dans les mêmes forages ainsi que destimations des conductivités électriques obtenues à partir de mesures de tomographic de résistivité électrique (ERT) sur l'ensemble de l'espace. Ces dernières mesures ont une faible résolution spatiale. La viabilité globale de cette méthode est testée en effectuant les simulations de flux et de transport au travers du modèle original du champ de conductivité hydraulique ainsi que du modèle simulé. Les simulations sont alors comparées. Les résultats obtenus indiquent que la procédure dintégration des données proposée permet d'obtenir des estimations de la conductivité en adéquation avec la structure à grande échelle ainsi que des predictions fiables des caractéristiques de transports sur des distances de moyenne à grande échelle. Les résultats correspondant au scénario de terrain indiquent que l'approche d'intégration des données nouvellement mise au point est capable d'appréhender correctement les hétérogénéitées à petite échelle aussi bien que les tendances à gande échelle du champ hydraulique prévalent. Les résultats montrent également une flexibilté remarquable et une robustesse de cette nouvelle approche dintégration des données. De ce fait, elle est susceptible d'être appliquée à un large éventail de données géophysiques et hydrologiques, à toutes les gammes déchelles. Dans la deuxième partie de ma thèse, j'évalue en détail la viabilité du réechantillonnage geostatique séquentiel comme mécanisme de proposition pour les méthodes Markov Chain Monte Carlo (MCMC) appliquées à des probmes inverses géophysiques et hydrologiques de grande dimension . L'objectif est de permettre une quantification plus précise et plus réaliste des incertitudes associées aux modèles obtenus. En considérant une série dexemples de tomographic radar puits à puits, j'étudie deux classes de stratégies de rééchantillonnage spatial en considérant leur habilité à générer efficacement et précisément des réalisations de la distribution postérieure bayésienne. Les résultats obtenus montrent que, malgré sa popularité, le réechantillonnage séquentiel est plutôt inefficace à générer des échantillons postérieurs indépendants pour des études de cas synthétiques réalistes, notamment pour le cas assez communs et importants où il existe de fortes corrélations spatiales entre le modèle et les paramètres. Pour résoudre ce problème, j'ai développé un nouvelle approche de perturbation basée sur une déformation progressive. Cette approche est flexible en ce qui concerne le nombre de paramètres du modèle et lintensité de la perturbation. Par rapport au rééchantillonage séquentiel, cette nouvelle approche s'avère être très efficace pour diminuer le nombre requis d'itérations pour générer des échantillons indépendants à partir de la distribution postérieure bayésienne. - Significant progress has been made with regard to the quantitative integration of geophysical and hydrological data at the local scale. However, extending corresponding approaches beyond the local scale still represents a major challenge, yet is critically important for the development of reliable groundwater flow and contaminant transport models. To address this issue, I have developed a hydrogeophysical data integration technique based on a two-step Bayesian sequential simulation procedure that is specifically targeted towards larger-scale problems. The objective is to simulate the distribution of a target hydraulic parameter based on spatially exhaustive, but poorly resolved, measurements of a pertinent geophysical parameter and locally highly resolved, but spatially sparse, measurements of the considered geophysical and hydraulic parameters. To this end, my algorithm links the low- and high-resolution geophysical data via a downscaling procedure before relating the downscaled regional-scale geophysical data to the high-resolution hydraulic parameter field. I first illustrate the application of this novel data integration approach to a realistic synthetic database consisting of collocated high-resolution borehole measurements of the hydraulic and electrical conductivities and spatially exhaustive, low-resolution electrical conductivity estimates obtained from electrical resistivity tomography (ERT). The overall viability of this method is tested and verified by performing and comparing flow and transport simulations through the original and simulated hydraulic conductivity fields. The corresponding results indicate that the proposed data integration procedure does indeed allow for obtaining faithful estimates of the larger-scale hydraulic conductivity structure and reliable predictions of the transport characteristics over medium- to regional-scale distances. The approach is then applied to a corresponding field scenario consisting of collocated high- resolution measurements of the electrical conductivity, as measured using a cone penetrometer testing (CPT) system, and the hydraulic conductivity, as estimated from electromagnetic flowmeter and slug test measurements, in combination with spatially exhaustive low-resolution electrical conductivity estimates obtained from surface-based electrical resistivity tomography (ERT). The corresponding results indicate that the newly developed data integration approach is indeed capable of adequately capturing both the small-scale heterogeneity as well as the larger-scale trend of the prevailing hydraulic conductivity field. The results also indicate that this novel data integration approach is remarkably flexible and robust and hence can be expected to be applicable to a wide range of geophysical and hydrological data at all scale ranges. In the second part of my thesis, I evaluate in detail the viability of sequential geostatistical resampling as a proposal mechanism for Markov Chain Monte Carlo (MCMC) methods applied to high-dimensional geophysical and hydrological inverse problems in order to allow for a more accurate and realistic quantification of the uncertainty associated with the thus inferred models. Focusing on a series of pertinent crosshole georadar tomographic examples, I investigated two classes of geostatistical resampling strategies with regard to their ability to efficiently and accurately generate independent realizations from the Bayesian posterior distribution. The corresponding results indicate that, despite its popularity, sequential resampling is rather inefficient at drawing independent posterior samples for realistic synthetic case studies, notably for the practically common and important scenario of pronounced spatial correlation between model parameters. To address this issue, I have developed a new gradual-deformation-based perturbation approach, which is flexible with regard to the number of model parameters as well as the perturbation strength. Compared to sequential resampling, this newly proposed approach was proven to be highly effective in decreasing the number of iterations required for drawing independent samples from the Bayesian posterior distribution.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Significant progress has been made with regard to the quantitative integration of geophysical and hydrological data at the local scale for the purpose of improving predictions of groundwater flow and solute transport. However, extending corresponding approaches to the regional scale still represents one of the major challenges in the domain of hydrogeophysics. To address this problem, we have developed a regional-scale data integration methodology based on a two-step Bayesian sequential simulation approach. Our objective is to generate high-resolution stochastic realizations of the regional-scale hydraulic conductivity field in the common case where there exist spatially exhaustive but poorly resolved measurements of a related geophysical parameter, as well as highly resolved but spatially sparse collocated measurements of this geophysical parameter and the hydraulic conductivity. To integrate this multi-scale, multi-parameter database, we first link the low- and high-resolution geophysical data via a stochastic downscaling procedure. This is followed by relating the downscaled geophysical data to the high-resolution hydraulic conductivity distribution. After outlining the general methodology of the approach, we demonstrate its application to a realistic synthetic example where we consider as data high-resolution measurements of the hydraulic and electrical conductivities at a small number of borehole locations, as well as spatially exhaustive, low-resolution estimates of the electrical conductivity obtained from surface-based electrical resistivity tomography. The different stochastic realizations of the hydraulic conductivity field obtained using our procedure are validated by comparing their solute transport behaviour with that of the underlying ?true? hydraulic conductivity field. We find that, even in the presence of strong subsurface heterogeneity, our proposed procedure allows for the generation of faithful representations of the regional-scale hydraulic conductivity structure and reliable predictions of solute transport over long, regional-scale distances.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Access to online repositories for genomic and associated "-omics" datasets is now an essential part of everyday research activity. It is important therefore that the Tuberculosis community is aware of the databases and tools available to them online, as well as for the database hosts to know what the needs of the research community are. One of the goals of the Tuberculosis Annotation Jamboree, held in Washington DC on March 7th-8th 2012, was therefore to provide an overview of the current status of three key Tuberculosis resources, TubercuList (tuberculist.epfl.ch), TB Database (www.tbdb.org), and Pathosystems Resource Integration Center (PATRIC, www.patricbrc.org). Here we summarize some key updates and upcoming features in TubercuList, and provide an overview of the PATRIC site and its online tools for pathogen RNA-Seq analysis.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Since 2008, Intelligence units of six states of the western part of Switzerland have been sharing a common database for the analysis of high volume crimes. On a daily basis, events reported to the police are analysed, filtered and classified to detect crime repetitions and interpret the crime environment. Several forensic outcomes are integrated in the system such as matches of traces with persons, and links between scenes detected by the comparison of forensic case data. Systematic procedures have been settled to integrate links assumed mainly through DNA profiles, shoemarks patterns and images. A statistical outlook on a retrospective dataset of series from 2009 to 2011 of the database informs for instance on the number of repetition detected or confirmed and increased by forensic case data. Time needed to obtain forensic intelligence in regard with the type of marks treated, is seen as a critical issue. Furthermore, the underlying integration process of forensic intelligence into the crime intelligence database raised several difficulties in regards of the acquisition of data and the models used in the forensic databases. Solutions found and adopted operational procedures are described and discussed. This process form the basis to many other researches aimed at developing forensic intelligence models.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Geophysical techniques can help to bridge the inherent gap with regard to spatial resolution and the range of coverage that plagues classical hydrological methods. This has lead to the emergence of the new and rapidly growing field of hydrogeophysics. Given the differing sensitivities of various geophysical techniques to hydrologically relevant parameters and their inherent trade-off between resolution and range the fundamental usefulness of multi-method hydrogeophysical surveys for reducing uncertainties in data analysis and interpretation is widely accepted. A major challenge arising from such endeavors is the quantitative integration of the resulting vast and diverse database in order to obtain a unified model of the probed subsurface region that is internally consistent with all available data. To address this problem, we have developed a strategy towards hydrogeophysical data integration based on Monte-Carlo-type conditional stochastic simulation that we consider to be particularly suitable for local-scale studies characterized by high-resolution and high-quality datasets. Monte-Carlo-based optimization techniques are flexible and versatile, allow for accounting for a wide variety of data and constraints of differing resolution and hardness and thus have the potential of providing, in a geostatistical sense, highly detailed and realistic models of the pertinent target parameter distributions. Compared to more conventional approaches of this kind, our approach provides significant advancements in the way that the larger-scale deterministic information resolved by the hydrogeophysical data can be accounted for, which represents an inherently problematic, and as of yet unresolved, aspect of Monte-Carlo-type conditional simulation techniques. We present the results of applying our algorithm to the integration of porosity log and tomographic crosshole georadar data to generate stochastic realizations of the local-scale porosity structure. Our procedure is first tested on pertinent synthetic data and then applied to corresponding field data collected at the Boise Hydrogeophysical Research Site near Boise, Idaho, USA.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Diplomityössä on tutkittu reaaliaikaisen toimintolaskennan toteuttamista suomalaisen lasersiruja valmistavan PK-yrityksen tietojärjestelmään. Lisäksi on tarkasteltu toimintolaskennan vaikutuksia operatiiviseen toimintaan sekä toimintojen johtamiseen. Työn kirjallisuusosassa on käsitelty kirjallisuuslähteiden perusteella toimintolaskennan teorioita, laskentamenetelmiä sekä teknisessä toteutuksessa käytettyjä teknologioita. Työn toteutusosassa suunniteltiin ja toteutettiin WWW-pohjainen toimintolaskentajärjestelmä case-yrityksen kustannuslaskennan sekä taloushallinnon avuksi. Työkalu integroitiin osaksi yrityksen toiminnanohjaus- sekä valmistuksenohjausjärjestelmää. Perinteisiin toimintolaskentamallien tiedonkeruujärjestelmiin verrattuna case-yrityksessä syötteet toimintolaskentajärjestelmälle tulevat reaaliaikaisesti osana suurempaa tietojärjestelmäintegraatiota.Diplomityö pyrkii luomaan suhteen toimintolaskennan vaatimusten ja tietokantajärjestelmien välille. Toimintolaskentajärjestelmää yritys voi hyödyntää esimerkiksi tuotteiden hinnoittelussa ja kustannuslaskennassa näkemällä tuotteisiin liittyviä kustannuksia eri näkökulmista. Päätelmiä voidaan tehdä tarkkaan kustannusinformaatioon perustuen sekä määrittää järjestelmän tuottaman datan perusteella, onko tietyn projektin, asiakkuuden tai tuotteen kehittäminen taloudellisesti kannattavaa.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we present a P2P-based database sharing system that provides information sharing capabilities through keyword-based search techniques. Our system requires neither a global schema nor schema mappings between different databases, and our keyword-based search algorithms are robust in the presence of frequent changes in the content and membership of peers. To facilitate data integration, we introduce keyword join operator to combine partial answers containing different keywords into complete answers. We also present an efficient algorithm that optimize the keyword join operations for partial answer integration. Our experimental study on both real and synthetic datasets demonstrates the effectiveness of our algorithms, and the efficiency of the proposed query processing strategies.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Brain activity can be measured non-invasively with functional imaging techniques. Each pixel in such an image represents a neural mass of about 105 to 107 neurons. Mean field models (MFMs) approximate their activity by averaging out neural variability while retaining salient underlying features, like neurotransmitter kinetics. However, MFMs incorporating the regional variability, realistic geometry and connectivity of cortex have so far appeared intractable. This lack of biological realism has led to a focus on gross temporal features of the EEG. We address these impediments and showcase a "proof of principle" forward prediction of co-registered EEG/fMRI for a full-size human cortex in a realistic head model with anatomical connectivity, see figure 1. MFMs usually assume homogeneous neural masses, isotropic long-range connectivity and simplistic signal expression to allow rapid computation with partial differential equations. But these approximations are insufficient in particular for the high spatial resolution obtained with fMRI, since different cortical areas vary in their architectonic and dynamical properties, have complex connectivity, and can contribute non-trivially to the measured signal. Our code instead supports the local variation of model parameters and freely chosen connectivity for many thousand triangulation nodes spanning a cortical surface extracted from structural MRI. This allows the introduction of realistic anatomical and physiological parameters for cortical areas and their connectivity, including both intra- and inter-area connections. Proper cortical folding and conduction through a realistic head model is then added to obtain accurate signal expression for a comparison to experimental data. To showcase the synergy of these computational developments, we predict simultaneously EEG and fMRI BOLD responses by adding an established model for neurovascular coupling and convolving "Balloon-Windkessel" hemodynamics. We also incorporate regional connectivity extracted from the CoCoMac database [1]. Importantly, these extensions can be easily adapted according to future insights and data. Furthermore, while our own simulation is based on one specific MFM [2], the computational framework is general and can be applied to models favored by the user. Finally, we provide a brief outlook on improving the integration of multi-modal imaging data through iterative fits of a single underlying MFM in this realistic simulation framework.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In order to best utilize the limited resource of medical resources, and to reduce the cost and improve the quality of medical treatment, we propose to build an interoperable regional healthcare systems among several levels of medical treatment organizations. In this paper, our approaches are as follows:(1) the ontology based approach is introduced as the methodology and technological solution for information integration; (2) the integration framework of data sharing among different organizations are proposed(3)the virtual database to realize data integration of hospital information system is established. Our methods realize the effective management and integration of the medical workflow and the mass information in the interoperable regional healthcare system. Furthermore, this research provides the interoperable regional healthcare system with characteristic of modularization, expansibility and the stability of the system is enhanced by hierarchy structure.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Background Major Depressive Disorder (MDD) is among the most prevalent and disabling medical conditions worldwide. Identification of clinical and biological markers (“biomarkers”) of treatment response could personalize clinical decisions and lead to better outcomes. This paper describes the aims, design, and methods of a discovery study of biomarkers in antidepressant treatment response, conducted by the Canadian Biomarker Integration Network in Depression (CAN-BIND). The CAN-BIND research program investigates and identifies biomarkers that help to predict outcomes in patients with MDD treated with antidepressant medication. The primary objective of this initial study (known as CAN-BIND-1) is to identify individual and integrated neuroimaging, electrophysiological, molecular, and clinical predictors of response to sequential antidepressant monotherapy and adjunctive therapy in MDD. Methods CAN-BIND-1 is a multisite initiative involving 6 academic health centres working collaboratively with other universities and research centres. In the 16-week protocol, patients with MDD are treated with a first-line antidepressant (escitalopram 10–20 mg/d) that, if clinically warranted after eight weeks, is augmented with an evidence-based, add-on medication (aripiprazole 2–10 mg/d). Comprehensive datasets are obtained using clinical rating scales; behavioural, dimensional, and functioning/quality of life measures; neurocognitive testing; genomic, genetic, and proteomic profiling from blood samples; combined structural and functional magnetic resonance imaging; and electroencephalography. De-identified data from all sites are aggregated within a secure neuroinformatics platform for data integration, management, storage, and analyses. Statistical analyses will include multivariate and machine-learning techniques to identify predictors, moderators, and mediators of treatment response. Discussion From June 2013 to February 2015, a cohort of 134 participants (85 outpatients with MDD and 49 healthy participants) has been evaluated at baseline. The clinical characteristics of this cohort are similar to other studies of MDD. Recruitment at all sites is ongoing to a target sample of 290 participants. CAN-BIND will identify biomarkers of treatment response in MDD through extensive clinical, molecular, and imaging assessments, in order to improve treatment practice and clinical outcomes. It will also create an innovative, robust platform and database for future research.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Usually, a Petri net is applied as an RFID model tool. This paper, otherwise, presents another approach to the Petri net concerning RFID systems. This approach, called elementary Petri net inside an RFID distributed database, or PNRD, is the first step to improve RFID and control systems integration, based on a formal data structure to identify and update the product state in real-time process execution, allowing automatic discovery of unexpected events during tag data capture. There are two main features in this approach: to use RFID tags as the object process expected database and last product state identification; and to apply Petri net analysis to automatically update the last product state registry during reader data capture. RFID reader data capture can be viewed, in Petri nets, as a direct analysis of locality for a specific transition that holds in a specific workflow. Following this direction, RFID readers storage Petri net control vector list related to each tag id is expected to be perceived. This paper presents PNRD cornerstones and a PNRD implementation example in software called DEMIS Distributed Environment in Manufacturing Information Systems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A myriad of methods are available for virtual screening of small organic compound databases. In this study we have successfully applied a quantitative model of consensus measurements, using a combination of 3D similarity searches (ROCS and EON), Hologram Quantitative Structure Activity Relationships (HQSAR) and docking (FRED, FlexX, Glide and AutoDock Vina), to retrieve cruzain inhibitors from collected databases. All methods were assessed individually and then combined in a Ligand-Based Virtual Screening (LBVS) and Target-Based Virtual Screening (TBVS) consensus scoring, using Receiving Operating Characteristic (ROC) curves to evaluate their performance. Three consensus strategies were used: scaled-rank-by-number, rank-by-rank and rank-by-vote, with the most thriving the scaled-rank-by-number strategy, considering that the stiff ROC curve appeared to be satisfactory in every way to indicate a higher enrichment power at early retrieval of active compounds from the database. The ligand-based method provided access to a robust and predictive HQSAR model that was developed to show superior discrimination between active and inactive compounds, which was also better than ROCS and EON procedures. Overall, the integration of fast computational techniques based on ligand and target structures resulted in a more efficient retrieval of cruzain inhibitors with desired pharmacological profiles that may be useful to advance the discovery of new trypanocidal agents.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Online geographic-databases have been growing increasingly as they have become a crucial source of information for both social networks and safety-critical systems. Since the quality of such applications is largely related to the richness and completeness of their data, it becomes imperative to develop adaptable and persistent storage systems, able to make use of several sources of information as well as enabling the fastest possible response from them. This work will create a shared and extensible geographic model, able to retrieve and store information from the major spatial sources available. A geographic-based system also has very high requirements in terms of scalability, computational power and domain complexity, causing several difficulties for a traditional relational database as the number of results increases. NoSQL systems provide valuable advantages for this scenario, in particular graph databases which are capable of modeling vast amounts of inter-connected data while providing a very substantial increase of performance for several spatial requests, such as finding shortestpath routes and performing relationship lookups with high concurrency. In this work, we will analyze the current state of geographic information systems and develop a unified geographic model, named GeoPlace Explorer (GE). GE is able to import and store spatial data from several online sources at a symbolic level in both a relational and a graph databases, where several stress tests were performed in order to find the advantages and disadvantages of each database paradigm.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a technique to share the data stored in an object-oriented database aimed at designing environments. This technique shares data between two related databases, called the Original and Product databases, and is composed of three processes: data separation, evolution and integration. Whenever a block of data needs to be shared, it is spread into both databases, resulting in a block on the original database, and another into the Product database, with special links between them controlled by the Object Manager. These blocks do not need to be maintained identical during the evolution phase of the sharing process. Six types of links were defined, and by choosing one, the designer control the evolution and reintegration of the block in both databases. This process uses the composite object concept as the unit of control. The presented concepts can be applied to any data model with support to composite objects.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A significant set of information stored in different databases around the world, can be shared through peer-topeer databases. With that, is obtained a large base of knowledge, without the need for large investments because they are used existing databases, as well as the infrastructure in place. However, the structural characteristics of peer-topeer, makes complex the process of finding such information. On the other side, these databases are often heterogeneous in their schemas, but semantically similar in their content. A good peer-to-peer databases systems should allow the user access information from databases scattered across the network and receive only the information really relate to your topic of interest. This paper proposes to use ontologies in peer-to-peer database queries to represent the semantics inherent to the data. The main contribution of this work is enable integration between heterogeneous databases, improve the performance of such queries and use the algorithm of optimization Ant Colony to solve the problem of locating information on peer-to-peer networks, which presents an improve of 18% in results. © 2011 IEEE.