895 results for Data linkage
Abstract:
Huge amounts of data are generated by a variety of information sources in healthcare, and these sources originate from a variety of clinical information systems and corporate data warehouses. The data derived from these sources are used for analysis and trending purposes, thus playing an influential role as a real-time decision-making tool. The unstructured, narrative data provided by these data sources qualify as healthcare big data, and researchers argue that the application of big data in healthcare might improve accountability and efficiency.
Abstract:
Distributed systems are widely used for solving large-scale and data-intensive computing problems, including all-to-all comparison (ATAC) problems. However, when used for ATAC problems, existing computational frameworks such as Hadoop focus on load balancing for allocating comparison tasks, without careful consideration of data distribution and storage usage. While Hadoop-based solutions provide users with simplicity of implementation, their inherent MapReduce computing pattern does not match the ATAC pattern. This leads to load imbalances and poor data locality when Hadoop's data distribution strategy is used for ATAC problems. Here we present a data distribution strategy which considers data locality, load balancing and storage savings for ATAC computing problems in homogeneous distributed systems. A simulated annealing algorithm is developed for data distribution and task scheduling. Experimental results show a significant performance improvement for our approach over Hadoop-based solutions.
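The trade-off between data locality and storage can be made concrete with a small sketch. It is only an illustration: the abstract does not give the authors' data structures or their simulated annealing formulation, so the `placement` mapping and the `locality_report` function below are hypothetical names. The sketch counts how many of the all-to-all comparison pairs can be run on a node that already holds both items, and how many item copies the placement stores.

```python
from itertools import combinations


def locality_report(placement):
    """Summarise a data placement for an all-to-all comparison problem.

    placement maps a node name to the set of data item ids stored on it
    (a hypothetical representation, not taken from the paper).
    """
    items = set().union(*placement.values())
    # Every unordered pair of items must be compared exactly once.
    all_pairs = set(combinations(sorted(items), 2))
    # A pair is "local" if at least one node stores both of its items.
    local_pairs = set()
    for stored in placement.values():
        local_pairs.update(combinations(sorted(stored), 2))
    # Storage cost: total number of item copies across all nodes.
    storage = sum(len(stored) for stored in placement.values())
    return {
        "pairs": len(all_pairs),
        "local_pairs": len(local_pairs),
        "storage": storage,
    }


# Four items on three nodes: every pair is local with 8 stored copies,
# whereas copying all items to all nodes would need 12.
placement = {"n1": {0, 1, 2}, "n2": {0, 3}, "n3": {1, 2, 3}}
report = locality_report(placement)
```

A search procedure such as simulated annealing could use a report like this as part of its cost function, although the cost function the paper actually uses is not stated in the abstract.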
Abstract:
The increase in data center dependent services has made energy optimization of data centers one of the most exigent challenges in today's Information Age. The necessity of green and energy-efficient measures is very high for reducing carbon footprint and exorbitant energy costs. However, inefficient application management of data centers results in high energy consumption and low resource utilization efficiency. Unfortunately, in most cases, deploying an energy-efficient application management solution inevitably degrades the resource utilization efficiency of the data centers. To address this problem, a Penalty-based Genetic Algorithm (GA) is presented in this paper to solve a defined profile-based application assignment problem whilst maintaining a trade-off between the power consumption performance and resource utilization performance. Case studies show that the penalty-based GA is highly scalable and provides 16% to 32% better solutions than a greedy algorithm.
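As a rough illustration of what a penalty-based fitness function looks like, the sketch below scores an application-to-server assignment by the power of the servers in use and adds a large penalty for any capacity overshoot. The abstract does not specify the encoding, the objective, or the penalty term, so every name here (`penalised_fitness`, `demand`, `capacity`, `power_per_server`, `penalty`) is an assumption made for the example.

```python
def penalised_fitness(assignment, demand, capacity, power_per_server, penalty=1000.0):
    """Score an assignment; lower is better.

    assignment[i] is the index of the server hosting application i.
    All parameters are hypothetical and not taken from the paper.
    """
    load = [0.0] * len(capacity)
    for app, server in enumerate(assignment):
        load[server] += demand[app]
    # Objective: power drawn by servers that host at least one application.
    power = sum(p for p, used in zip(power_per_server, load) if used > 0)
    # Constraint violation: total load above capacity, summed over servers.
    overshoot = sum(max(0.0, used - cap) for used, cap in zip(load, capacity))
    # Infeasible assignments stay in the population but score much worse.
    return power + penalty * overshoot


demand = [2.0, 3.0, 4.0]
capacity = [5.0, 5.0]
power_per_server = [10.0, 10.0]

feasible = penalised_fitness([0, 0, 1], demand, capacity, power_per_server)
infeasible = penalised_fitness([0, 0, 0], demand, capacity, power_per_server)
```

Packing everything onto one server uses less power but overshoots its capacity, so the penalty makes it score worse than the feasible two-server assignment.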
Abstract:
In the past few years, the virtual machine (VM) placement problem has been studied intensively and many algorithms for the VM placement problem have been proposed. However, those proposed VM placement algorithms have not been widely used in today's cloud data centers as they do not consider the migration cost from current VM placement to the new optimal VM placement. As a result, the gain from optimizing VM placement may be less than the loss of the migration cost from current VM placement to the new VM placement. To address this issue, this paper presents a penalty-based genetic algorithm (GA) for the VM placement problem that considers the migration cost in addition to the energy-consumption of the new VM placement and the total inter-VM traffic flow in the new VM placement. The GA has been implemented and evaluated by experiments, and the experimental results show that the GA outperforms two well known algorithms for the VM placement problem.
Abstract:
Although live VM migration has been intensively studied, the problem of live migration of multiple interdependent VMs has hardly been investigated. The most important problem in the live migration of multiple interdependent VMs is how to schedule VM migrations as the schedule will directly affect the total migration time and the total downtime of those VMs. Aiming at minimizing both the total migration time and the total downtime simultaneously, this paper presents a Strength Pareto Evolutionary Algorithm 2 (SPEA2) for the multi-VM migration scheduling problem. The SPEA2 has been evaluated by experiments, and the experimental results show that the SPEA2 can generate a set of VM migration schedules with a shorter total migration time and a shorter total downtime than an existing genetic algorithm, namely Random Key Genetic Algorithm (RKGA). This paper also studies the scalability of the SPEA2.
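Minimising total migration time and total downtime at once means comparing schedules by Pareto dominance rather than by a single score. The sketch below shows that comparison only; it is not an implementation of SPEA2, and the schedule values are made-up numbers for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one.

    Each schedule is a tuple (total_migration_time, total_downtime); both are minimised.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical (migration time, downtime) values for four candidate schedules.
schedules = [(120, 8), (100, 12), (130, 9), (100, 10)]
front = pareto_front(schedules)
```

Here `(100, 12)` is dominated by `(100, 10)` and `(130, 9)` by `(120, 8)`, so the front keeps the two schedules that trade migration time against downtime.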
Abstract:
The concept of big data has already outperformed traditional data management efforts in almost all industries. In other instances, it has succeeded in obtaining promising results that provide value from large-scale integration and analysis of heterogeneous data sources, for example genomic and proteomic information. Big data analytics has become increasingly important in describing the data sets and analytical techniques in software applications that are so large and complex, due to its significant advantages including better business decisions, cost reduction and delivery of new products and services [1]. In a similar context, the health community has experienced not only more complex and large data content, but also information systems that contain a large number of data sources with interrelated and interconnected data attributes. These have resulted in challenging and highly dynamic environments, leading to the creation of big data with its numerous complexities, for instance the sharing of information with the expected security requirements of stakeholders. Compared with other sectors, big data analysis in the health sector is still in its early stages. Key challenges include accommodating the volume, velocity and variety of healthcare data with the current deluge of exponential growth. Given the complexity of big data, it is understood that while data storage and accessibility are technically manageable, the implementation of Information Accountability measures for healthcare big data might be a practical solution in support of information security, privacy and traceability measures. Transparency is one important measure that can demonstrate integrity, which is a vital factor in healthcare services. Clarity about performance expectations is considered to be another Information Accountability measure, which is necessary to avoid data ambiguity and controversy about interpretation and, finally, liability [2].
According to current studies [3], Electronic Health Records (EHRs) are key information resources for big data analysis and are also composed of varied co-created values [3]. Common healthcare information originates from and is used by different actors and groups that facilitate understanding of the relationship with other data sources. Consequently, healthcare services often serve as an integrated service bundle. Although this is a critical requirement in healthcare services and analytics, it is difficult to find a comprehensive set of guidelines for adopting EHRs to fulfil big data analysis requirements. Therefore, as a remedy, this research work focuses on a systematic approach containing comprehensive guidelines on the accurate data that must be provided to apply and evaluate big data analysis until the necessary decision-making requirements are fulfilled to improve the quality of healthcare services. Hence, we believe that this approach would subsequently improve quality of life.
Abstract:
With the ever increasing amount of eHealth data available from various eHealth systems and sources, Health Big Data Analytics promises enticing benefits such as enabling the discovery of new treatment options and improved decision making. However, concerns over the privacy of information have hindered the aggregation of this information. To address these concerns, we propose the use of Information Accountability protocols to provide patients with the ability to decide how and when their data can be shared and aggregated for use in big data research. In this paper, we discuss the issues surrounding Health Big Data Analytics and propose a consent-based model to address privacy concerns to aid in achieving the promised benefits of Big Data in eHealth.
Abstract:
Concerns over the security and privacy of patient information are one of the biggest hindrances to sharing health information and the wide adoption of eHealth systems. At present, there are competing requirements between healthcare consumers' (i.e. patients) requirements and healthcare professionals' (HCP) requirements. While consumers want control over their information, healthcare professionals want access to as much information as required in order to make well-informed decisions and provide quality care. In order to balance these requirements, the use of an Information Accountability Framework devised for eHealth systems has been proposed. In this paper, we take a step closer to the adoption of the Information Accountability protocols and demonstrate their functionality through an implementation in FluxMED, a customisable EHR system.
Abstract:
Genome-wide association studies (GWAS) have identified around 60 common variants associated with multiple sclerosis (MS), but these loci only explain a fraction of the heritability of MS. Some missing heritability may be caused by rare variants that have been suggested to play an important role in the aetiology of complex diseases such as MS. However, current genetic and statistical methods for detecting rare variants are expensive and time consuming. 'Population-based linkage analysis' (PBLA), or so-called identity-by-descent (IBD) mapping, is a novel way to detect rare variants in extant GWAS datasets. We employed BEAGLE fastIBD to search for rare MS variants utilising IBD mapping in a large GWAS dataset of 3,543 cases and 5,898 controls. We identified a genome-wide significant linkage signal on chromosome 19 (LOD = 4.65; p = 1.9×10⁻⁶). Network analysis of cases and controls sharing haplotypes on chromosome 19 further strengthened the association, as there are more large networks of cases sharing haplotypes than controls. This linkage region includes a cluster of zinc finger genes of unknown function. Analysis of genome-wide transcriptome data suggests that genes in this zinc finger cluster may be involved in very early developmental regulation of the CNS. Our study also indicates that BEAGLE fastIBD allowed identification of rare variants in a large unrelated population with moderate computational intensity. Even with the development of whole-genome sequencing, IBD mapping still may be a promising way to narrow down the region of interest for sequencing priority. © 2013 Lin et al.
Abstract:
Multiple sclerosis (MS) is a debilitating, chronic demyelinating disease of the central nervous system affecting over 2 million people worldwide. The TAM family of receptor tyrosine kinases (TYRO3, AXL and MERTK) have been implicated as important players during demyelination in both animal models of MS and in the human disease. We therefore conducted an association study to identify single nucleotide polymorphisms (SNPs) within genes encoding the TAM receptors and their ligands associated with MS. Analysis of genotype data from a genome-wide association study which consisted of 1618 MS cases and 3413 healthy controls conducted by the Australia and New Zealand Multiple Sclerosis Genetics Consortium (ANZgene) revealed several SNPs within the MERTK gene (Chromosome 2q14.1, Accession Number NG_011607.1) that showed suggestive association with MS. We therefore interrogated 28 SNPs in MERTK in an independent replication cohort of 1140 MS cases and 1140 healthy controls. We found 12 SNPs that replicated, with 7 SNPs showing p-values of less than 10⁻⁵ when the discovery and replication cohorts were combined. All 12 replicated SNPs were in strong linkage disequilibrium with each other. In combination, these data suggest the MERTK gene is a novel risk gene for MS susceptibility. © 2011 Ma et al.
Abstract:
Though difficult, the study of gene-environment interactions in multifactorial diseases is crucial for interpreting the relevance of non-heritable factors and prevents genetic associations with small but measurable effects from being overlooked. We propose a "candidate interactome" (i.e. a group of genes whose products are known to physically interact with environmental factors that may be relevant for disease pathogenesis) analysis of genome-wide association data in multiple sclerosis. We looked for statistical enrichment of associations among interactomes that, at the current state of knowledge, may be representative of gene-environment interactions of potential, uncertain or unlikely relevance for multiple sclerosis pathogenesis: Epstein-Barr virus, human immunodeficiency virus, hepatitis B virus, hepatitis C virus, cytomegalovirus, HHV8-Kaposi sarcoma, H1N1-influenza, JC virus, human innate immunity interactome for type I interferon, autoimmune regulator, vitamin D receptor, aryl hydrocarbon receptor and a panel of proteins targeted by 70 innate immune-modulating viral open reading frames from 30 viral species. Interactomes were either obtained from the literature or were manually curated. The P values of all single nucleotide polymorphisms mapping to a given interactome were obtained from the last genome-wide association study of the International Multiple Sclerosis Genetics Consortium and the Wellcome Trust Case Control Consortium 2. The interaction between genotype and Epstein-Barr virus emerges as relevant for multiple sclerosis etiology. However, in line with recent data on the coexistence of common and unique strategies used by viruses to perturb the human molecular system, other viruses also have a similar potential, though probably less relevant in epidemiological terms. © 2013 Mechelli et al.
Abstract:
Objective. To identify genomic regions linked with determinants of age at symptom onset, disease activity, and functional impairment in ankylosing spondylitis (AS). Methods. A whole genome linkage scan was performed in 188 affected sibling pair families with 454 affected individuals. Traits assessed were age at symptom onset, disease activity assessed by the Bath Ankylosing Spondylitis Disease Activity Index (BASDAI), and functional impairment assessed by the Bath Ankylosing Spondylitis Functional Index (BASFI). Parametric and nonparametric quantitative linkage analysis was performed using parameters defined in a previous segregation study. Results. Heritabilities of the traits studied in this data set were as follows: BASDAI 0.49 (P = 0.0001, 95% confidence interval [95% CI] 0.23-0.75), BASFI 0.76 (P = 10⁻⁷, 95% CI 0.49-1.0), and age at symptom onset 0.33 (P = 0.005, 95% CI 0.04-0.62). No linkage was observed between the major histocompatibility complex (MHC) and any of the traits studied (logarithm of odds [LOD] score <1.0). "Significant" linkage (LOD score 4.0) was observed between a region on chromosome 18p and the BASDAI. Age at symptom onset showed "suggestive" linkage to chromosome 11p (LOD score 3.3). Maximum linkage with the BASFI was seen at chromosome 2q (LOD score 2.9). Conclusion. In contrast to the genetic determinants of susceptibility to AS, clinical manifestations of the disease measured by the BASDAI, BASFI, and age at symptom onset are largely determined by a small number of genes not encoded within the MHC.
Abstract:
Objective. To analyze the effect of HLA-DR genes on susceptibility to and severity of ankylosing spondylitis (AS). Methods. Three hundred sixty-three white British AS patients were studied; 149 were carefully assessed for a range of clinical manifestations, and disease severity was assessed using a structured questionnaire. Limited HLA class I typing and complete HLA-DR typing were performed using DNA-based methods. HLA data from 13,634 healthy white British bone marrow donors were used for comparison. Results. A significant association between DR1 and AS was found, independent of HLA-B27 (overall odds ratio [OR] 1.4, 95% confidence interval [95% CI] 1.1-1.8, P = 0.02; relative risk [RR] 2.7, 95% CI 1.5-4.8, P = 6 × 10⁻⁴ among homozygotes; RR 2.1, 95% CI 1.5-2.8, P = 5 × 10⁻⁶ among heterozygotes). A large but weakly significant association between DR8 and AS was noted, particularly among DR8 homozygotes (RR 6.8, 95% CI 1.6-29.2, P = 0.01 among homozygotes; RR 1.6, 95% CI 1.0-2.7, P = 0.07 among heterozygotes). A negative association with DR12 (OR 0.22, 95% CI 0.09-0.5, P = 0.001) was noted. HLA-DR7 was associated with younger age at onset of disease (mean age at onset 18 years for DR7-positive patients and 23 years for DR7-negative patients; Z score 3.21, P = 0.001). No other HLA class I or class II associations with disease severity or with different clinical manifestations of AS were found. Conclusion. The results of this study suggest that HLA-DR genes may have a weak effect on susceptibility to AS independent of HLA-B27, but do not support suggestions that they affect disease severity or different clinical manifestations.
Abstract:
Objective. To localize the regions containing genes that determine susceptibility to ankylosing spondylitis (AS). Methods. One hundred five white British families with 121 affected sibling pairs with AS were recruited, largely from the Royal National Hospital for Rheumatic Diseases AS database. A genome-wide linkage screen was undertaken using 254 highly polymorphic microsatellite markers from the Medical Research Council (UK) (MRC) set. The major histocompatibility complex (MHC) region was studied more intensively using 5 microsatellites lying within the HLA class III region and HLA-DRB1 typing. The Analyze package was used for 2-point analysis, and GeneHunter for multipoint analysis. Results. When only the MRC set was considered, 11 markers in 7 regions achieved a P value of ≤0.01. The maximum logarithm of odds score obtained was 3.8 (P = 1.4 × 10⁻⁵) using marker D6S273, which lies in the HLA class III region. A further marker used in mapping of the MHC class III region achieved a LOD score of 8.1 (P = 1 × 10⁻⁹). Nine of 118 affected sibling pairs (7.6%) did not share parental haplotypes identical by descent across the MHC, suggesting that only 31% of the susceptibility to AS is coded by genes linked to the MHC. The maximum non-MHC LOD score obtained was 2.6 (P = 0.0003) for marker D16S422. Conclusion. The results of this study confirm the strong linkage of the MHC with AS, and provide suggestive evidence regarding the presence and location of non-MHC genes influencing susceptibility to the disease.
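The LOD scores and P values quoted in this abstract can be related through a common asymptotic conversion. The abstract does not say how its P values were computed, so the sketch below is an assumption rather than the authors' method: it treats 2·ln(10)·LOD as a chi-square statistic with one degree of freedom and halves the tail probability because the test is one-sided. The function name `lod_to_p` is hypothetical.

```python
import math


def lod_to_p(lod):
    """Approximate one-sided P value for a LOD score.

    Assumes 2 * ln(10) * LOD follows a chi-square distribution with
    1 degree of freedom under the null, with the tail halved for a
    one-sided test. This is a standard approximation, not a formula
    taken from the abstract.
    """
    lod = max(lod, 0.0)  # negative LOD scores carry no evidence for linkage
    chi2 = 2.0 * math.log(10.0) * lod
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


p_mhc = lod_to_p(3.8)      # LOD reported for marker D6S273
p_non_mhc = lod_to_p(2.6)  # LOD reported for marker D16S422
```

Under this assumption the two values come out close to the reported 1.4 × 10⁻⁵ and 0.0003, which suggests, but does not prove, that a conversion of this kind was used.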
Abstract:
Objective. To examine whether the T cell receptor (TCR) A or TCRB loci exhibit linkage with disease in multiplex rheumatoid arthritis (RA) families. Methods. A linkage study was performed in 184 RA families from the UK Arthritis and Rheumatism Council Repository, each containing at least 1 affected sibpair. The microsatellites D14S50, TCRA, and D14S64 spanning the TCRA locus and D7S509, Vβ6.7, and D7S688 spanning the TCRB locus were used as DNA markers. The subjects were genotyped using a semiautomated polymerase chain reaction-based method. Two-point and multipoint linkage analyses were performed. Results. Nonparametric single-marker likelihood odds (LOD) scores were 0.49 (P = 0.07) for D14S50, 0.65 (P = 0.04) for TCRA, 0.07 (P = 0.29) for D14S64, 0.01 (P = 0.43) for D7S509, 0.0 (P = 0.50) for Vβ6.7, and 0.0 (P = 0.50) for D7S688. By multipoint analysis, there was no evidence of linkage at TCRB (LOD score 0), and the maximum LOD score at the TCRA locus was 0.37 (at D14S50). The presence of a susceptibility locus (LOD score < -2.0) was excluded, with lambda ≤ 1.8 at TCRA and ≤1.4 at TCRB. Conclusion. These linkage studies provide no significant evidence of a major germline-encoded TCRA or TCRB component of susceptibility to RA.