953 results for open data capabilities


Relevance: 30.00%

Publisher:

Abstract:

Next-generation sequencing (NGS) technology has become a prominent tool in biological and biomedical research. However, NGS data analysis, such as de novo assembly, mapping and variant detection, is far from mature, and the high sequencing error rate is one of the major problems. To minimize the impact of sequencing errors, we developed a highly robust and efficient method, MTM, to correct errors in NGS reads. We demonstrated the effectiveness of MTM on both single-cell data with highly non-uniform coverage and normal data with uniformly high coverage, showing that MTM's performance does not depend on the coverage of the sequencing reads. MTM was also compared with Hammer and Quake, the best methods for correcting non-uniform and uniform data, respectively. For non-uniform data, MTM outperformed both Hammer and Quake; for uniform data, MTM performed better than Quake and comparably to Hammer. Better error correction with MTM improved the quality of downstream analyses such as mapping and SNP detection. SNP calling is a major application of NGS technologies, but the existence of sequencing errors complicates this process, especially at low coverage (…)
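The abstract does not describe MTM's algorithm, but k-mer-spectrum correction is the general idea behind tools such as Quake and Hammer: k-mers seen at least a threshold number of times are "trusted", and a base is edited when a single substitution turns an untrusted k-mer into a trusted one. The sketch below is a hypothetical illustration of that approach, not MTM itself; the function names and threshold are illustrative.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count all k-mers across a collection of reads."""
    counts = Counter()
    for r in reads:
        for i in range(len(r) - k + 1):
            counts[r[i:i + k]] += 1
    return counts

def correct_read(read, counts, k, threshold=2):
    """Greedy single-substitution correction: for each untrusted k-mer,
    try every one-base change and keep the one yielding the most
    frequent trusted k-mer (illustrative sketch, not MTM)."""
    read = list(read)
    for i in range(len(read) - k + 1):
        kmer = ''.join(read[i:i + k])
        if counts[kmer] >= threshold:
            continue  # k-mer is trusted; assume no error here
        best = None
        for j in range(k):
            for b in 'ACGT':
                if b == read[i + j]:
                    continue
                cand = kmer[:j] + b + kmer[j + 1:]
                if counts[cand] >= threshold and (
                        best is None or counts[cand] > counts[best[2]]):
                    best = (i + j, b, cand)
        if best:
            read[best[0]] = best[1]  # apply the correction in place
    return ''.join(read)
```

With ten error-free copies of a read and one copy carrying a single substitution, the erroneous k-mers occur only once, fall below the threshold, and are corrected back to the trusted sequence.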

Relevance: 30.00%

Publisher:

Abstract:

These three manuscripts are presented as a PhD dissertation on the use of a GeoVis application to evaluate telehealth programs. The primary aim of this research was to understand how GeoVis applications can be designed and developed by combining a human-centered (HC) approach with cognitive fit theory (CFT), and in turn used to evaluate a telehealth program in Brazil.

First manuscript: The first manuscript presents background on the use of GeoVisualization to facilitate visual exploration of public health data and covers the challenges associated with adopting existing GeoVis applications. It combines the principles of the HC approach and CFT into a framework that lays the foundation of this research. The framework is then used to propose the design, development and evaluation of "the SanaViz" for telehealth data in Brazil, as a proof of concept.

Second manuscript: The second manuscript is a methods paper describing the approaches employed to design and develop "the SanaViz" based on the proposed framework. Drawing on the elements of the HC approach and CFT, a mixed-methods design combining card sorting and sketching techniques was used. A representative sample of 20 participants involved in the telehealth program at the NUTES telehealth center at UFPE, Recife, Brazil was enrolled. The findings helped us understand the needs of the diverse group of telehealth users and the tasks they perform, and determine the essential features to include in the proposed GeoVis application "the SanaViz".
Third manuscript: The third manuscript used a mixed-methods approach to compare the effectiveness and usefulness of the HC GeoVis application "the SanaViz" against a conventional GeoVis application, "Instant Atlas". The same 20 participants enrolled for Aim 2 took part, and a combination of quantitative and qualitative assessments was performed. Effectiveness was gauged by the time participants took to complete tasks with each application, the ease with which they completed them, and the number of attempts each task required. Usefulness was assessed with the System Usability Scale (SUS), a validated questionnaire tested in prior studies. In-depth interviews were conducted to gather opinions about both applications. This manuscript demonstrated, as a proof of concept, the usefulness and effectiveness of HC GeoVis applications for visual exploration of telehealth data.

Together, these three manuscripts address the challenges of combining the principles of the HC approach and CFT to design and develop GeoVis applications as a method to evaluate telehealth data. To our knowledge, this is the first study to explore the usefulness and effectiveness of GeoVis for visual exploration of telehealth data. The research enabled us to develop a framework for the design and development of GeoVis applications for public health, and especially telehealth; showed which users were involved with the telehealth program and what tasks they performed; and identified the components essential to such GeoVis applications.
Our research yielded the following findings: (a) telehealth users vary in their level of understanding of GeoVis; (b) interaction features such as zooming, sorting, linking and multiple views, and representation features such as bar charts and choropleth maps, were considered the most essential features of GeoVis applications; (c) comparing and sorting were the two main tasks telehealth users would perform for exploratory data analysis; and (d) an HC GeoVis prototype is more effective and useful for exploring telehealth data than a conventional GeoVis application. Future studies should incorporate the proposed HC GeoVis framework to enable comprehensive assessment of users and the tasks they perform, in order to identify the features that GeoVis applications need. The results of this study demonstrate a novel approach to comprehensively and systematically enhance the evaluation of telehealth programs using the proposed GeoVis framework.
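The SUS questionnaire used above has a fixed scoring rule: each of the ten items is rated 1 to 5; odd-numbered items contribute (rating − 1), even-numbered items contribute (5 − rating), and the sum is multiplied by 2.5 to give a 0–100 score. A minimal scorer (the function name is mine, not from the dissertation):

```python
def sus_score(responses):
    """System Usability Scale score (0-100) from ten Likert ratings,
    each 1 (strongly disagree) to 5 (strongly agree)."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten ratings on a 1-5 scale")
    # index 0, 2, 4, ... are the odd-numbered items (1, 3, 5, ...)
    total = sum((r - 1) if i % 2 == 0 else (5 - r)
                for i, r in enumerate(responses))
    return total * 2.5
```

A respondent who strongly agrees with every positively worded (odd) item and strongly disagrees with every negatively worded (even) item scores the maximum of 100; uniform neutral answers score 50.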

Relevance: 30.00%

Publisher:

Abstract:

Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Owing to the rapid development of genotyping and sequencing technologies, we can now assess the causal effects of many genetic and environmental factors more accurately. Genome-wide association studies have localized many causal genetic variants predisposing to certain diseases, but these studies explain only a small portion of the heritability of disease. More advanced statistical models are urgently needed to identify and characterize additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to increasing computational capability and novel statistical developments, Bayesian methods have been widely applied in genetics/genomics research, demonstrating superiority over standard approaches in certain areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods can fully exert their advantages. This dissertation focuses on developing new Bayesian statistical methods for data analysis with complex gene-environment and gene-gene interactions, and on extending existing methods for gene-environment interactions to related areas. It includes three parts: (1) deriving a Bayesian variable selection framework for hierarchical gene-environment and gene-gene interactions; (2) developing Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending two Bayesian statistical methods developed for gene-environment interaction studies to related problems such as adaptively borrowing historical data.
We propose a Bayesian hierarchical mixture model framework that allows us to investigate genetic and environmental effects, gene-gene interactions (epistasis) and gene-environment interactions in the same model. In many practical situations there is a natural hierarchical structure between the main effects and interactions in a linear model. We incorporate this hierarchy into the Bayesian mixture model, so that irrelevant interaction effects are removed more efficiently, yielding more robust, parsimonious and powerful models. We evaluate both 'strong hierarchical' and 'weak hierarchical' models, which require that both, or at least one, respectively, of the main effects of the interacting factors be present for the interaction to be included in the model. Extensive simulations show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and provide a powerful approach for identifying predisposing main effects and interactions in studies with complex gene-environment and gene-gene interactions. We also compare these two models with an 'independent' model that imposes no hierarchical constraint and observe their superior performance in most of the situations considered. The proposed models are applied to real gene-environment interaction data from lung cancer and cutaneous melanoma case-control studies. Bayesian statistical models can incorporate useful prior information into the modeling process; moreover, the Bayesian mixture model outperforms the multivariate logistic model in parameter estimation and variable selection in most cases.
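The strong and weak heredity rules above can be stated compactly: under strong heredity an interaction may enter the model only if both of its parent main effects are included; under weak heredity, if at least one is. A minimal sketch of that admissibility check (effect names are hypothetical; the dissertation embeds the constraint in Bayesian mixture priors rather than a hard filter):

```python
def admissible(interaction, selected_main, rule="strong"):
    """Heredity constraint on an interaction term.

    interaction: pair of parent main-effect names, e.g. ("G1", "E1")
    selected_main: set of main effects currently in the model
    rule: "strong" (both parents required) or "weak" (at least one)
    """
    a, b = interaction
    present = (a in selected_main) + (b in selected_main)
    if rule == "strong":
        return present == 2   # both parents must be in the model
    if rule == "weak":
        return present >= 1   # one parent suffices
    raise ValueError("rule must be 'strong' or 'weak'")
```

In a stochastic search, candidate interactions failing this check are assigned zero (strong) or reduced (weak) prior inclusion probability, which is how the irrelevant interactions get pruned efficiently.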
Our proposed models enforce hierarchical constraints that further improve the Bayesian mixture model by reducing the proportion of false positive findings among the identified interactions while still recovering the reported associations. This is practically appealing when investigating causal factors from a moderate number of candidate genetic and environmental factors together with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects were previously developed to provide an analysis framework in which the estimated effects for a quantitative trait are statistically orthogonal regardless of whether Hardy-Weinberg Equilibrium (HWE) holds within loci. Ma et al. (2012) recently developed a NOIA model for gene-environment interaction studies and showed its advantages over the usual functional model for detecting true main effects and interactions. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with both the NOIA model and the usual functional model. The proposed Bayesian NOIA model has more power to detect non-null effects, with higher marginal posterior probabilities. We also review two Bayesian statistical models (the Bayesian empirical shrinkage-type estimator and Bayesian model averaging) developed for gene-environment interaction studies. Inspired by these models, we develop two novel statistical methods that handle related problems such as borrowing data from historical studies. The proposed methods are analogous to the gene-environment interaction methods in how they balance statistical efficiency and bias in a unified model.
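For a single biallelic locus, the NOIA statistical coding builds additive and dominance regressors that are orthogonal under the observed genotype frequencies whether or not HWE holds, which is what makes the effect estimates statistically orthogonal. The sketch below illustrates the published NOIA formulation as I understand it, not the dissertation's combined Bayesian model:

```python
def noia_statistical_coding(p11, p12, p22):
    """NOIA statistical coding for one biallelic locus.

    p11, p12, p22: observed frequencies of genotypes carrying
    0, 1, 2 copies of the reference allele (need not satisfy HWE).
    Returns (additive, dominance) codes for those three genotypes;
    the two scales are orthogonal under the observed frequencies.
    """
    assert abs(p11 + p12 + p22 - 1.0) < 1e-9, "frequencies must sum to 1"
    mu = p12 + 2 * p22                 # mean allele count
    add = [0 - mu, 1 - mu, 2 - mu]     # centered additive scale
    denom = p11 + p22 - (p11 - p22) ** 2
    dom = [-2 * p12 * p22 / denom,     # dominance scale, scaled so that
           4 * p11 * p22 / denom,      # it is orthogonal to the additive
           -2 * p11 * p12 / denom]     # scale under (p11, p12, p22)
    return add, dom
```

Orthogonality means that, weighting by the genotype frequencies, each scale has mean zero and the two scales are uncorrelated, so dropping or adding the dominance term does not change the additive estimate.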
In extensive simulation studies, we compare the operating characteristics of the proposed models with existing models, including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow historical data in a data-driven way. These novel models may have a broad range of statistical applications in both genetic/genomic and clinical studies.

Relevance: 30.00%

Publisher:

Abstract:

Sediments of Lake Donggi Cona on the northeastern Tibetan Plateau were studied to infer changes in the lacustrine depositional environment related to climatic and non-climatic forcing during the last 19 kyr. The lake today fills a 30 × 8 km, 95 m deep tectonic basin associated with the Kunlun Fault. The study was conducted on a sediment-core transect through the lake basin in order to gain a complete picture of spatiotemporal environmental change. The recovered sediments are partly finely laminated and are composed of calcareous muds with variable amounts of carbonate micrite, organic matter, and detrital silt and clay. On the basis of sedimentological, geochemical and mineralogical data, up to five lithological units (LU) can be distinguished that document distinct stages in the development of the lake system. The onset of the lowermost LU, with lacustrine muds above basal sands, indicates that the lake level was at least 39 m below the present level and started to rise after 19 ka, possibly in response to regional deglaciation. At this time, the lacustrine environment was characterized by detrital sediment influx and the deposition of siliciclastic sediment. In two sediment cores, upward grain-size coarsening documents a lake-level fall after 13 cal ka BP, possibly associated with the late-glacial Younger Dryas stadial. From 11.5 to 4.3 cal ka BP, grain-size fining in sediment cores from the profundal coring sites and the onset of lacustrine deposition at a littoral core site (2 m water depth) in a recent marginal bay of Donggi Cona document lake-level rise during the early to mid-Holocene to at least the modern level. In addition, high biological productivity and pronounced precipitation of carbonate micrite are consistent with warm and moist climate conditions related to an enhanced summer monsoon influence. At 4.3 cal ka BP the lake shifted from an aragonite- to a calcite-dominated system, indicating a change towards a fully open hydrological lake system.
The younger clay-rich sediments are, moreover, non-laminated and lack diagenetic sulphides, pointing to fully ventilated conditions and the prevailing absence of lake stratification. This turning point in the lake's history could reflect either a threshold response to insolation-forced climate cooling or a non-climatic trigger, such as an erosional event or a tectonic pulse from a strong earthquake; our data do not allow us to distinguish between these possibilities.

Relevance: 30.00%

Publisher:

Abstract:

The reduction in sea ice along the SE Greenland coast during the last century has severely impacted ice-rafting to this area. In order to reconstruct ice-rafting and oceanographic conditions in the Denmark Strait area during the last ~150 years, we conducted a multiproxy study on three short (20 cm) sediment cores from the outer Kangerdlugssuaq Trough (~300 m water depth). The proxy-based data were compared with historical and instrumental data to better understand ice sheet-ocean interactions in the area. A robust chronology was developed from 210Pb and 137Cs measurements on core PO175GKC#9 (~66.2°N, 32°W) and extended to the two adjacent cores through correlation of the calcite weight-percent records. Our proxy records include sea-ice and phytoplankton biomarkers and a variety of mineralogical determinations: identification by quantitative X-ray diffraction of the <2 mm sediment fraction, ice-rafted debris counts on the 63-150 µm sand fraction, and source identifications based on the composition of Fe oxides in the 45-250 µm fraction. Multivariate statistical analysis indicated significant correlations between our proxy records and the historical data, especially the mean annual temperature record from Stykkishólmur (Iceland) and the storis index (historical observations of sea-ice export via the East Greenland Current). In particular, the biological proxies (calcite weight percent, IP25 and total organic carbon %) showed significant linkage with the storis index. Our records reveal two distinct intervals in the recent history of the SE Greenland coast: the first (AD 1850-1910) shows predominantly perennial sea-ice conditions in the area, while the second (AD 1910-1990) shows more seasonally open water conditions.
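Short-core 210Pb chronologies like the one above are commonly built with the constant-rate-of-supply (CRS) model, in which the age at a depth is t = (1/λ) ln(A0/Az), where A0 is the total excess-210Pb inventory, Az the inventory below that depth, and λ the 210Pb decay constant (half-life 22.3 yr). The sketch below assumes the CRS model; the actual age model used for these cores may differ, and the 137Cs measurements typically serve as an independent check on the 210Pb ages.

```python
import math

PB210_LAMBDA = math.log(2) / 22.3  # 210Pb decay constant (1/yr), half-life 22.3 yr

def crs_ages(excess_pb210, dry_mass):
    """Constant-rate-of-supply (CRS) 210Pb age model (illustrative sketch).

    excess_pb210: unsupported 210Pb activity per layer (Bq/kg), top to bottom
    dry_mass: dry mass per unit area of each layer (kg/m2)
    Returns the age (years before coring) at the base of each layer.
    """
    inventories = [a * m for a, m in zip(excess_pb210, dry_mass)]
    total = sum(inventories)
    ages, below = [], total
    for layer in inventories:
        below -= layer  # inventory remaining beneath this layer's base
        if below <= 0:
            ages.append(float('inf'))  # beyond the 210Pb dating range
        else:
            ages.append(math.log(total / below) / PB210_LAMBDA)
    return ages
```

Ages increase monotonically downcore, and the model naturally flags the depth at which excess 210Pb is exhausted (roughly 100-150 years, consistent with the ~150-year window of this study).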