42 results for NewSQL databases
at Duke University
Abstract:
During the summer of 2016, Duke University Libraries staff began a project to update the way that research databases are displayed on the library website. The new research databases page is a customized version of the default A-Z list that Springshare provides for its LibGuides content management system. Duke Libraries staff made adjustments to the content and interface of the page. In order to see how Duke users navigated the new interface, usability testing was conducted on August 9th, 2016.
Abstract:
As more diagnostic testing options become available to physicians, it becomes more difficult to combine various types of medical information in order to optimize the overall diagnosis. To improve diagnostic performance, here we introduce an approach to optimize a decision-fusion technique to combine heterogeneous information, such as from different modalities, feature categories, or institutions. For classifier comparison we used two performance metrics: the area under the receiver operating characteristic (ROC) curve (AUC) and the normalized partial area under the curve (pAUC). This study used four classifiers: linear discriminant analysis (LDA), artificial neural network (ANN), and two variants of our decision-fusion technique, AUC-optimized (DF-A) and pAUC-optimized (DF-P) decision fusion. We applied each of these classifiers with 100-fold cross-validation to two heterogeneous breast cancer data sets: one of mass lesion features and a much more challenging one of microcalcification lesion features. For the calcification data set, DF-A outperformed the other classifiers in terms of AUC (p < 0.02) and achieved AUC=0.85 +/- 0.01. The DF-P surpassed the other classifiers in terms of pAUC (p < 0.01) and reached pAUC=0.38 +/- 0.02. For the mass data set, DF-A outperformed both the ANN and the LDA (p < 0.04) and achieved AUC=0.94 +/- 0.01. Although for this data set there were no statistically significant differences among the classifiers' pAUC values (pAUC=0.57 +/- 0.07 to 0.67 +/- 0.05, p > 0.10), the DF-P did significantly improve specificity versus the LDA at both 98% and 100% sensitivity (p < 0.04). In conclusion, decision fusion directly optimized clinically significant performance measures, such as AUC and pAUC, and sometimes outperformed two well-known machine-learning techniques when applied to two different breast cancer data sets.
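For readers unfamiliar with the AUC metric named in this abstract, the following is a minimal sketch of one standard way to compute it: as the fraction of (positive, negative) score pairs that the classifier ranks correctly, with ties counted as half. The function name `roc_auc` and the toy scores are hypothetical illustrations, not the study's code or data, and the normalized pAUC is not covered because the abstract does not define its normalization.

```python
def roc_auc(scores_pos, scores_neg):
    """Estimate AUC as P(score of a positive > score of a negative).

    Ties between a positive and a negative score count as half a win.
    This pairwise form is equivalent to the Mann-Whitney U statistic
    divided by the number of pairs.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Toy classifier outputs (hypothetical): higher score = more suspicious lesion.
malignant_scores = [0.9, 0.8, 0.4]
benign_scores = [0.7, 0.3, 0.2]
auc = roc_auc(malignant_scores, benign_scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for a sketch; a rank-based implementation would scale better on large data sets.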
Not published, not indexed: issues in generating and finding hospice and palliative care literature.
Abstract:
INTRODUCTION: Accessing new knowledge as the evidence base for hospice and palliative care grows has specific challenges for the discipline. This study aimed to describe conversion rates of palliative and hospice care conference abstracts to journal articles and to highlight that some palliative care literature may not be retrievable because it is not indexed on bibliographic databases. METHODS: Substudy A tracked the journal publication of conference abstracts selected for inclusion in a gray literature database on www.caresearch.com.au. Abstracts were included in the gray literature database following handsearching of proceedings of over 100 Australian conferences likely to have some hospice or palliative care content that were held between 1980 and 1999. Substudy B looked at indexing from first publication until 2001 of three international hospice and palliative care journals in four widely available bibliographic databases through systematic tracing of all original papers in the journals. RESULTS: Substudy A showed that for the 1338 abstracts identified only 15.9% were published (compared to an average in health of 45%). Published abstracts were found in 78 different journals. Multiauthor abstracts and oral presentations had higher rates of conversion. Substudy B demonstrated lag time between first publication and bibliographic indexing. Even after listing, idiosyncratic noninclusions were identified. DISCUSSION: There are limitations to retrieval of all possible literature through electronic searching of bibliographic databases. Encouraging publication in indexed journals of studies presented at conferences, promoting selection of palliative care journals for database indexing, and searching more than one bibliographic database will improve the accessibility of existing and new knowledge in hospice and palliative care.
Abstract:
BACKGROUND: Outpatient palliative care, an evolving delivery model, seeks to improve continuity of care across settings and to increase access to services in hospice and palliative medicine (HPM). It can provide a critical bridge between inpatient palliative care and hospice, filling the gap in community-based supportive care for patients with advanced life-limiting illness. Low capacities for data collection and quantitative research in HPM have impeded assessment of the impact of outpatient palliative care. APPROACH: In North Carolina, a regional database for community-based palliative care has been created through a unique partnership between an HPM organization and an academic medical center. This database flexibly uses information technology to collect patient data, entered at the point of care (e.g., home, inpatient hospice, assisted living facility, nursing home). HPM physicians and nurse practitioners collect data; data are transferred to an academic site that assists with analyses and data management. Reports to community-based sites, based on data they provide, create a better understanding of local care quality. CURRENT STATUS: The data system was developed and implemented over a 2-year period, starting with one community-based HPM site and expanding to four. Data collection methods were collaboratively created and refined. The database continues to grow. Analyses presented herein examine data from one site and encompass 2572 visits from 970 new patients, characterizing the population, symptom profiles, and change in symptoms after intervention. CONCLUSION: A collaborative regional approach to HPM data can support evaluation and improvement of palliative care quality at the local, aggregated, and statewide levels.
Abstract:
In a stochastic environment, long-term fitness can be influenced by variation, covariation, and serial correlation in vital rates (survival and fertility). Yet no study of an animal population has parsed the contributions of these three aspects of variability to long-term fitness. We do so using a unique database that includes complete life-history information for wild-living individuals of seven primate species that have been the subjects of long-term (22-45 years) behavioral studies. Overall, the estimated levels of vital rate variation had only minor effects on long-term fitness, and the effects of vital rate covariation and serial correlation were even weaker. To explore why, we compared estimated variances of adult survival in primates with values for other vertebrates in the literature and found that adult survival is significantly less variable in primates than it is in the other vertebrates. Finally, we tested the prediction that adult survival, because it more strongly influences fitness in a constant environment, will be less variable than newborn survival, and we found only mixed support for the prediction. Our results suggest that wild primates may be buffered against detrimental fitness effects of environmental stochasticity by their highly developed cognitive abilities, social networks, and broad, flexible diets.
Abstract:
The affective impact of music arises from a variety of factors, including intensity, tempo, rhythm, and tonal relationships. The emotional coloring evoked by intensity, tempo, and rhythm appears to arise from association with the characteristics of human behavior in the corresponding condition; however, how and why particular tonal relationships in music convey distinct emotional effects are not clear. The hypothesis examined here is that major and minor tone collections elicit different affective reactions because their spectra are similar to the spectra of voiced speech uttered in different emotional states. To evaluate this possibility the spectra of the intervals that distinguish major and minor music were compared to the spectra of voiced segments in excited and subdued speech using fundamental frequency and frequency ratios as measures. Consistent with the hypothesis, the spectra of major intervals are more similar to spectra found in excited speech, whereas the spectra of particular minor intervals are more similar to the spectra of subdued speech. These results suggest that the characteristic affective impact of major and minor tone collections arises from associations routinely made between particular musical intervals and voiced speech.
Abstract:
BACKGROUND: Over the past two decades more than fifty thousand unique clinical and biological samples have been assayed using the Affymetrix HG-U133 and HG-U95 GeneChip microarray platforms. This substantial repository has been used extensively to characterize changes in gene expression between biological samples, but has not been previously mined en masse for changes in mRNA processing. We explored the possibility of using HG-U133 microarray data to identify changes in alternative mRNA processing in several available archival datasets. RESULTS: Data from these and other gene expression microarrays can now be mined for changes in transcript isoform abundance using a program described here, SplicerAV. Using in vivo and in vitro breast cancer microarray datasets, SplicerAV was able to perform both gene- and isoform-specific expression profiling within the same microarray dataset. Our reanalysis of Affymetrix U133 plus 2.0 data generated by in vitro over-expression of HRAS, E2F3, beta-catenin (CTNNB1), SRC, and MYC identified several hundred oncogene-induced mRNA isoform changes, one of which revealed a previously unknown mechanism of EGFR family activation. Using clinical data, SplicerAV predicted 241 isoform changes between low and high grade breast tumors, with changes enriched among genes coding for guanyl-nucleotide exchange factors, metalloprotease inhibitors, and mRNA processing factors. Isoform changes in 15 genes were associated with aggressive cancer across the three breast cancer datasets. CONCLUSIONS: Using SplicerAV, we identified several hundred previously uncharacterized isoform changes induced by in vitro oncogene over-expression and revealed a previously unknown mechanism of EGFR activation in human mammary epithelial cells. We analyzed Affymetrix GeneChip data from over 400 human breast tumors in three independent studies, making this the largest clinical dataset analyzed for en masse changes in alternative mRNA processing. The capacity to detect RNA isoform changes in archival microarray data using SplicerAV allowed us to carry out the first analysis of isoform-specific mRNA changes directly associated with cancer survival.
Abstract:
BACKGROUND: Malignant glioma is a rare cancer with poor survival. The influence of diet and antioxidant intake on glioma survival is not well understood. The current study examines the association between antioxidant intake and survival after glioma diagnosis. METHODS: Adult patients diagnosed with malignant glioma during 1991-1994 and 1997-2001 were enrolled in a population-based study. Diagnosis was confirmed by review of pathology specimens. A modified food-frequency questionnaire interview was completed by each glioma patient or a designated proxy. Intake of each food item was converted to grams consumed/day. From this nutrient database, 16 antioxidants, calcium, a total antioxidant index and 3 macronutrients were available for survival analysis. Cox regression estimated mortality hazard ratios associated with each nutrient and the antioxidant index adjusting for potential confounders. Nutrient values were categorized into tertiles. Models were stratified by histology (Grades II, III, and IV) and conducted for all (including proxy) subjects and for a subset of self-reported subjects. RESULTS: Geometric mean values for 11 fat-soluble and 6 water-soluble individual antioxidants, antioxidant index and 3 macronutrients were virtually the same when comparing all cases (n=748) to self-reported cases only (n=450). For patients diagnosed with Grade II and Grade III histology, moderate (915.8-2118.3 mcg) intake of fat-soluble lycopene was associated with poorer survival when compared to low intake (0.0-914.8 mcg), for self-reported cases only. High intake of vitamin E and moderate/high intake of secoisolariciresinol among Grade III patients indicated greater survival for all cases. In Grade IV patients, moderate/high intake of cryptoxanthin and high intake of secoisolariciresinol were associated with poorer survival among all cases. Among Grade II patients, moderate intake of water-soluble folate was associated with greater survival for all cases; high intake of vitamin C and genistein and the highest level of the antioxidant index were associated with poorer survival for all cases. CONCLUSIONS: The associations observed in our study suggest that the influence of some antioxidants on survival following a diagnosis of malignant glioma is inconsistent and varies by histology group. Further research in a large sample of glioma patients is needed to confirm/refute our results.
Abstract:
BACKGROUND: One year after the introduction of Information and Communication Technology (ICT) to support diagnostic imaging at our hospital, clinicians had faster and better access to radiology reports and images; direct access to Computed Tomography (CT) reports in the Electronic Medical Record (EMR) was particularly popular. The objective of this study was to determine whether improvements in radiology reporting and clinical access to diagnostic imaging information one year after the ICT introduction were associated with a reduction in the length of patients' hospital stays (LOS). METHODS: Data describing hospital stays and diagnostic imaging were collected retrospectively from the EMR during periods of equal duration before and one year after the introduction of ICT. The post-ICT period was chosen because of the documented improvement in clinical access to radiology results during that period. The data set was randomly split into an exploratory part used to establish the hypotheses, and a confirmatory part. The data was used to compare the pre-ICT and post-ICT status, but also to compare differences between groups. RESULTS: There was no general reduction in LOS one year after ICT introduction. However, there was a 25% reduction for one group - patients with CT scans. This group was heterogeneous, covering 445 different primary discharge diagnoses. Analyses of subgroups were performed to reduce the impact of this divergence. CONCLUSION: Our results did not indicate that improved access to radiology results reduced the patients' LOS. There was, however, a significant reduction in LOS for patients undergoing CT scans. Given the clinicians' interest in CT reports and the results of the subgroup analyses, it is likely that improved access to CT reports contributed to this reduction.
Abstract:
BACKGROUND: There is considerable interest in the development of methods to efficiently identify all coding variants present in large sample sets of humans. There are three approaches possible: whole-genome sequencing, whole-exome sequencing using exon capture methods, and RNA-Seq. While whole-genome sequencing is the most complete, it remains sufficiently expensive that cost-effective alternatives are important. RESULTS: Here we provide a systematic exploration of how well RNA-Seq can identify human coding variants by comparing variants identified through high coverage whole-genome sequencing to those identified by high coverage RNA-Seq in the same individual. This comparison allowed us to directly evaluate the sensitivity and specificity of RNA-Seq in identifying coding variants, and to evaluate how key parameters such as the degree of coverage and the expression levels of genes interact to influence performance. We find that although only 40% of exonic variants identified by whole-genome sequencing were captured using RNA-Seq, this number rose to 81% when concentrating on genes known to be well-expressed in the source tissue. We also find that a high false positive rate can be problematic when working with RNA-Seq data, especially at higher levels of coverage. CONCLUSIONS: We conclude that as long as a tissue relevant to the trait under study is available and suitable quality control screens are implemented, RNA-Seq is a fast and inexpensive alternative approach for finding coding variants in genes with sufficiently high expression levels.
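The comparison described in this abstract can be pictured as set arithmetic over variant calls. The sketch below treats the whole-genome calls as the reference set and reports what fraction of them also appear in the RNA-Seq calls, plus the fraction of RNA-Seq calls absent from the reference. The function names, the tuple representation of a variant, and the toy calls are all hypothetical; the authors' actual pipeline and definitions are not given in the abstract.

```python
def sensitivity(reference_calls, test_calls):
    """Fraction of reference variants that the test method also called."""
    reference = set(reference_calls)
    if not reference:
        raise ValueError("reference call set is empty")
    return len(reference & set(test_calls)) / len(reference)


def unsupported_fraction(reference_calls, test_calls):
    """Fraction of test calls with no matching reference variant."""
    test = set(test_calls)
    if not test:
        raise ValueError("test call set is empty")
    return len(test - set(reference_calls)) / len(test)


# Toy variant calls as (chromosome, position, ref allele, alt allele).
wgs_calls = [
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 400, "G", "A"),
    ("chr3", 900, "T", "C"),
    ("chr5", 120, "G", "C"),
]
rna_calls = [
    ("chr1", 100, "A", "G"),
    ("chr2", 400, "G", "A"),
    ("chr7", 555, "A", "T"),  # not seen in the reference set
]

print(f"sensitivity = {sensitivity(wgs_calls, rna_calls):.2f}")
print(f"unsupported = {unsupported_fraction(wgs_calls, rna_calls):.2f}")
```

Restricting both call sets to well-expressed genes before calling `sensitivity` would mirror the second figure quoted in the abstract, under the assumption that gene membership is known for each variant.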
Abstract:
Tumor microenvironmental stresses, such as hypoxia and lactic acidosis, play important roles in tumor progression. Although gene signatures reflecting the influence of these stresses are powerful approaches to link expression with phenotypes, they do not fully reflect the complexity of human cancers. Here, we describe the use of latent factor models to further dissect the stress gene signatures in a breast cancer expression dataset. The genes in these latent factors are coordinately expressed in tumors and depict distinct, interacting components of the biological processes. The genes in several latent factors are highly enriched in chromosomal locations. When these factors are analyzed in independent datasets with gene expression and array CGH data, the expression values of these factors are highly correlated with copy number alterations (CNAs) of the corresponding BAC clones in both the cell lines and tumors. Therefore, variation in the expression of these pathway-associated factors is at least partially caused by variation in gene dosage and CNAs among breast cancers. We have also found the expression of two latent factors without any chromosomal enrichment is highly associated with 12q CNA, likely an instance of "trans"-variations in which CNA leads to the variations in gene expression outside of the CNA region. In addition, we have found that factor 26 (1q CNA) is negatively correlated with HIF-1alpha protein and hypoxia pathways in breast tumors and cell lines. This agrees with, and for the first time links, known good prognosis associated with both a low hypoxia signature and the presence of CNA in this region. Taken together, these results suggest the possibility that tumor segmental aneuploidy makes significant contributions to variation in the lactic acidosis/hypoxia gene signatures in human cancers and demonstrate that latent factor analysis is a powerful means to uncover such a linkage.
Abstract:
We present the analysis of twenty human genomes to evaluate the prospects for identifying rare functional variants that contribute to a phenotype of interest. We sequenced at high coverage ten "case" genomes from individuals with severe hemophilia A and ten "control" genomes. We summarize the number of genetic variants emerging from a study of this magnitude, and provide a proof of concept for the identification of rare and highly-penetrant functional variants by confirming that the cause of hemophilia A is easily recognizable in this data set. We also show that the number of novel single nucleotide variants (SNVs) discovered per genome seems to stabilize at about 144,000 new variants per genome, after the first 15 individuals have been sequenced. Finally, we find that, on average, each genome carries 165 homozygous protein-truncating or stop loss variants in genes representing a diverse set of pathways.
Abstract:
BACKGROUND: Dropouts and missing data are nearly ubiquitous in obesity randomized controlled trials, threatening validity and generalizability of conclusions. Herein, we meta-analytically evaluate the extent of missing data, the frequency with which various analytic methods are employed to accommodate dropouts, and the performance of multiple statistical methods. METHODOLOGY/PRINCIPAL FINDINGS: We searched PubMed and Cochrane databases (2000-2006) for articles published in English and manually searched bibliographic references. Articles of pharmaceutical randomized controlled trials with weight loss or weight gain prevention as major endpoints were included. Two authors independently reviewed each publication for inclusion. 121 articles met the inclusion criteria. Two authors independently extracted treatment, sample size, drop-out rates, study duration, and statistical method used to handle missing data from all articles and resolved disagreements by consensus. In the meta-analysis, drop-out rates were substantial, with the survival (non-dropout) rates being approximated by an exponential decay curve, $e^{-\lambda t}$, where $\lambda$ was estimated to be 0.0088 (95% bootstrap confidence interval: 0.0076 to 0.0100) and $t$ represents time in weeks. The estimated drop-out rate at 1 year was 37%. Most studies used last observation carried forward as the primary analytic method to handle missing data. We also obtained 12 raw obesity randomized controlled trial datasets for empirical analyses. Analyses of raw randomized controlled trial data suggested that both mixed models and multiple imputation performed well, but that multiple imputation may be more robust when missing data are extensive. CONCLUSION/SIGNIFICANCE: Our analysis offers an equation for predictions of dropout rates useful for future study planning. Our raw data analyses suggest that multiple imputation is better than other methods for handling missing data in obesity randomized controlled trials, followed closely by mixed models. We suggest these methods supplant last observation carried forward as the primary method of analysis.
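The dropout equation reported in this abstract is easy to check numerically. The sketch below evaluates the exponential retention curve with the reported rate constant of 0.0088 per week and confirms that it gives roughly the 37% dropout quoted for one year, taking a year as 52 weeks (an assumption; the abstract does not state the exact week count). The function and constant names are hypothetical.

```python
import math

# Rate constant reported in the abstract (per week).
LAMBDA_PER_WEEK = 0.0088


def retention(weeks, lam=LAMBDA_PER_WEEK):
    """Expected fraction of participants still enrolled after `weeks`."""
    return math.exp(-lam * weeks)


def dropout(weeks, lam=LAMBDA_PER_WEEK):
    """Expected fraction of participants who have dropped out by `weeks`."""
    return 1.0 - retention(weeks, lam)


# One year is taken as 52 weeks here (assumption).
one_year = dropout(52)
print(f"Estimated dropout at 52 weeks: {one_year:.1%}")

# The reported bootstrap interval for lambda gives a rough range.
low, high = dropout(52, 0.0076), dropout(52, 0.0100)
print(f"Range from the reported interval: {low:.1%} to {high:.1%}")
```

For study planning, the same function can be inverted: a trial that needs `n` completers at week `t` would enroll about `n / retention(t)` participants, assuming this curve applies to the planned trial.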
Abstract:
BACKGROUND: Sharing of epidemiological and clinical data sets among researchers is poor at best, to the detriment of science and the community at large. The purpose of this paper is therefore to (1) describe a novel Web application designed to share information on study data sets focusing on epidemiological clinical research in a collaborative environment and (2) create a policy model placing this collaborative environment into the current scientific social context. METHODOLOGY: The Database of Databases application was developed based on feedback from epidemiologists and clinical researchers requiring a Web-based platform that would allow for sharing of information about epidemiological and clinical study data sets in a collaborative environment. This platform should ensure that researchers can modify the information. Model-based predictions of the number of publications and funding resulting from combinations of different policy implementation strategies (for metadata and data sharing) were generated using System Dynamics modeling. PRINCIPAL FINDINGS: The application allows researchers to easily upload information about clinical study data sets, which is searchable and modifiable by other users in a wiki environment. All modifications are filtered by the database principal investigator in order to maintain quality control. The application has been extensively tested and currently contains 130 clinical study data sets from the United States, Australia, China and Singapore. Model results indicated that any policy implementation would be better than the current strategy, that metadata sharing is better than data sharing, and that combined policies achieve the best results in terms of publications. CONCLUSIONS: Based on our empirical observations and resulting model, the social network environment surrounding the application can help epidemiologists and clinical researchers contribute and search for metadata in a collaborative environment, thus potentially facilitating collaboration efforts among research communities distributed around the globe.
Abstract:
BACKGROUND: With the globalization of clinical trials, large developing nations have substantially increased their participation in multi-site studies. This participation has raised ethical concerns, among them the fear that local customs, habits and culture are not respected when potential participants are asked to take part in a study. This knowledge gap is particularly noticeable among Indian subjects, since despite the large number of participants, little is known regarding what factors affect their willingness to participate in clinical trials. METHODS: We conducted a meta-analysis of all studies evaluating the factors and barriers, from the perspective of potential Indian participants, contributing to their participation in clinical trials. We searched both international and Indian-specific bibliographic databases, including PubMed, Cochrane, Openjgate, MedInd, Scirus and Medknow, also performing hand searches and communicating with authors to obtain additional references. We enrolled studies dealing exclusively with the participation of Indians in clinical trials. Data extraction was conducted by three researchers, with disagreements being resolved by consensus. RESULTS: Six qualitative studies and one survey were found evaluating the main themes affecting the participation of Indian subjects. Factors favoring participation included personal health benefits, altruism, trust in physicians, source of extra income, detailed knowledge, and methods for motivating participants; barriers included mistrust of trial organizations, concerns about the efficacy and safety of trials, psychological reasons, trial burden, loss of confidentiality, dependency issues, and language. CONCLUSION: We identified factors that facilitate, and barriers that negatively affect, trial participation decisions in Indian subjects. Due consideration and weight should be assigned to these factors when planning future trials in India.