20 results for error rates
at Duke University
Abstract:
Single-molecule sequencing instruments can generate multikilobase sequences with the potential to greatly improve genome and transcriptome assembly. However, the error rates of single-molecule reads are high, which has limited their use thus far to resequencing bacteria. To address this limitation, we introduce a correction algorithm and assembly strategy that uses short, high-fidelity sequences to correct the error in single-molecule sequences. We demonstrate the utility of this approach on reads generated by a PacBio RS instrument from phage, prokaryotic and eukaryotic whole genomes, including the previously unsequenced genome of the parrot Melopsittacus undulatus, as well as for RNA-Seq reads of the corn (Zea mays) transcriptome. Our long-read correction achieves >99.9% base-call accuracy, leading to substantially better assemblies than current sequencing strategies: in the best example, the median contig size was quintupled relative to high-coverage, second-generation assemblies. Greater gains are predicted if read lengths continue to increase, including the prospect of single-contig bacterial chromosome assembly.
Abstract:
BACKGROUND: Historically, only partial assessments of data quality have been performed in clinical trials, for which the most common method of measuring database error rates has been to compare the case report form (CRF) to database entries and count discrepancies. Importantly, errors arising from medical record abstraction and transcription are rarely evaluated as part of such quality assessments. Electronic Data Capture (EDC) technology has had a further impact, as paper CRFs typically leveraged for quality measurement are not used in EDC processes. METHODS AND PRINCIPAL FINDINGS: The National Institute on Drug Abuse Treatment Clinical Trials Network has developed, implemented, and evaluated methodology for holistically assessing data quality on EDC trials. We characterize the average source-to-database error rate (14.3 errors per 10,000 fields) for the first year of use of the new evaluation method. This error rate was significantly lower than the average of published error rates for source-to-database audits, and was similar to CRF-to-database error rates reported in the published literature. We attribute this largely to an absence of medical record abstraction on the trials we examined, and to an outpatient setting characterized by less acute patient conditions. CONCLUSIONS: Historically, medical record abstraction is the most significant source of error by an order of magnitude, and should be measured and managed during the course of clinical trials. Source-to-database error rates are highly dependent on the amount of structured data collection in the clinical setting and on the complexity of the medical record, dependencies that should be considered when developing data quality benchmarks.
Abstract:
Complex diseases will have multiple functional sites, and it will be invaluable to understand the cross-locus interaction in terms of linkage disequilibrium (LD) between those sites (epistasis) in addition to the haplotype-LD effects. We investigated the statistical properties of a class of matrix-based statistics to assess this epistasis. These statistical methods include two LD contrast tests (Zaykin et al., 2006) and partial least squares regression (Wang et al., 2008). To estimate Type 1 error rates and power, we simulated multiple two-variant disease models using the SIMLA software package. SIMLA allows for the joint action of up to two disease genes in the simulated data with all possible multiplicative interaction effects between them. Our goal was to detect an interaction between multiple disease-causing variants by means of their linkage disequilibrium (LD) patterns with other markers. We measured the effects that marginal disease effect size, haplotype LD, disease prevalence and minor allele frequency have on cross-locus interaction (epistasis). In the setting of strong allele effects and strong interaction, the correlation between the two disease genes was weak (r=0.2). In a complex system with multiple correlations (both marginal and interaction), it was difficult to determine the source of a significant result. Despite these complications, the partial least squares and modified LD contrast methods maintained adequate power to detect the epistatic effects; however, in many of the analyses we could not separate interaction from a strong marginal effect. While we did not exhaust the entire parameter space of possible models, we do provide guidance on the effects that population parameters have on cross-locus interaction.
Abstract:
Human use of the oceans is increasingly in conflict with conservation of endangered species. Methods for managing the spatial and temporal placement of industries such as military, fishing, transportation and offshore energy, have historically been post hoc; i.e. the time and place of human activity is often already determined before assessment of environmental impacts. In this dissertation, I build robust species distribution models in two case study areas, US Atlantic (Best et al. 2012) and British Columbia (Best et al. 2015), predicting presence and abundance respectively, from scientific surveys. These models are then applied to novel decision frameworks for preemptively suggesting optimal placement of human activities in space and time to minimize ecological impacts: siting for offshore wind energy development, and routing ships to minimize risk of striking whales. Both decision frameworks relate the tradeoff between conservation risk and industry profit with synchronized variable and map views as online spatial decision support systems.
For siting offshore wind energy development (OWED) in the U.S. Atlantic (chapter 4), bird density maps are combined across species with weights of OWED sensitivity to collision and displacement, and 10 km² sites are compared against OWED profitability based on average annual wind speed at 90 m hub heights and distance to transmission grid. A spatial decision support system enables toggling between the map and tradeoff plot views by site. A selected site can be inspected for sensitivity to cetaceans throughout the year, so as to capture months of the year which minimize episodic impacts of pre-operational activities such as seismic airgun surveying and pile driving.
Routing ships to avoid whale strikes (chapter 5) can be similarly viewed as a tradeoff, but is a different problem spatially. A cumulative cost surface is generated from density surface maps and conservation status of cetaceans, and is then applied as a resistance surface to calculate least-cost routes between start and end locations, i.e. ports and entrance locations to study areas. Varying a multiplier to the cost surface enables calculation of multiple routes with different costs to conservation of cetaceans versus cost to transportation industry, measured as distance. Similar to the siting chapter, a spatial decision support system enables toggling between the map and tradeoff plot view of proposed routes. The user can also input arbitrary start and end locations to calculate the tradeoff on the fly.
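The routing tradeoff described above can be illustrated with a minimal sketch. It assumes a small rectangular grid, four-directional movement, and a step cost of one distance unit plus the multiplier times the conservation risk of the cell being entered. The grid values, the function names `least_cost_route` and `route_tradeoff`, and the use of Dijkstra's algorithm are illustrative assumptions, not the dissertation's implementation.

```python
import heapq


def least_cost_route(risk, start, end, multiplier):
    """Dijkstra over a 4-connected grid (hypothetical cost model).

    Entering a cell costs 1 (distance) + multiplier * risk[row][col].
    Returns the list of (row, col) cells from start to end.
    """
    rows, cols = len(risk), len(risk[0])
    best = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        cost, cell = heapq.heappop(heap)
        if cell == end:
            break
        if cost > best[cell]:
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + 1.0 + multiplier * risk[nr][nc]
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (new_cost, (nr, nc)))
    # Walk back from the end cell to recover the route.
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path


def route_tradeoff(risk, path):
    """Return (distance in steps, summed risk of the cells entered)."""
    distance = len(path) - 1
    exposure = sum(risk[r][c] for r, c in path[1:])
    return distance, exposure


# Illustrative risk surface: one high-risk cell between the two endpoints.
risk = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
for multiplier in (0.0, 1.0):
    path = least_cost_route(risk, (1, 0), (1, 2), multiplier)
    print(multiplier, route_tradeoff(risk, path))
```

With a multiplier of zero the route is simply the shortest one and crosses the high-risk cell; raising the multiplier trades extra distance for lower exposure, which gives one point per multiplier on a tradeoff plot.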
Essential to the input of these decision frameworks are distributions of the species. The two preceding chapters comprise species distribution models from two case study areas, U.S. Atlantic (chapter 2) and British Columbia (chapter 3), predicting presence and density, respectively. Although density is preferred to estimate potential biological removal, per Marine Mammal Protection Act requirements in the U.S., all the necessary parameters, especially distance and angle of observation, are less readily available across publicly mined datasets.
In the case of predicting cetacean presence in the U.S. Atlantic (chapter 2), I extracted datasets from the online OBIS-SEAMAP geo-database, and integrated scientific surveys conducted by ship (n=36) and aircraft (n=16), weighting a Generalized Additive Model by minutes surveyed within space-time grid cells to harmonize effort between the two survey platforms. For each of 16 cetacean species guilds, I predicted the probability of occurrence from static environmental variables (water depth, distance to shore, distance to continental shelf break) and time-varying conditions (monthly sea-surface temperature). To generate maps of presence vs. absence, Receiver Operator Characteristic (ROC) curves were used to define the optimal threshold that minimizes false positive and false negative error rates. I integrated model outputs, including tables (species in guilds, input surveys) and plots (fit of environmental variables, ROC curve), into an online spatial decision support system, allowing for easy navigation of models by taxon, region, season, and data provider.
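As a rough illustration of the thresholding step, the sketch below scans candidate thresholds and keeps the one with the smallest sum of false positive and false negative rates (equivalent to maximizing Youden's J on the ROC curve). The abstract does not state the exact criterion used, so the summed-rate rule, the function name `best_threshold`, and the toy scores are assumptions.

```python
def best_threshold(scores, labels):
    """Pick the cut-off minimizing false positive rate + false negative rate.

    scores: predicted probabilities of occurrence.
    labels: 1 for observed presence, 0 for absence.
    A cell is predicted "present" when its score >= threshold.
    Returns (threshold, false_positive_rate, false_negative_rate).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for t in sorted(set(scores)):
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        fpr, fnr = fp / negatives, fn / positives
        if best is None or fpr + fnr < best[1] + best[2]:
            best = (t, fpr, fnr)
    return best


# Toy data (hypothetical): seven grid cells with model scores and observations.
scores = [0.05, 0.1, 0.2, 0.4, 0.5, 0.7, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1]
print(best_threshold(scores, labels))
```

Other criteria, such as weighting false negatives more heavily for rare species, would change only the comparison inside the loop.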
For predicting cetacean density within the inner waters of British Columbia (chapter 3), I calculated density from systematic, line-transect marine mammal surveys over multiple years and seasons (summer 2004, 2005, 2008, and spring/autumn 2007) conducted by Raincoast Conservation Foundation. Abundance estimates were calculated using two different methods: Conventional Distance Sampling (CDS) and Density Surface Modelling (DSM). CDS generates a single density estimate for each stratum, whereas DSM explicitly models spatial variation and offers potential for greater precision by incorporating environmental predictors. Although DSM yields a more relevant product for the purposes of marine spatial planning, CDS has proven to be useful in cases where there are fewer observations available for seasonal and inter-annual comparison, particularly for the scarcely observed elephant seal. Abundance estimates are provided on a stratum-specific basis. Steller sea lions and harbour seals are further differentiated by ‘hauled out’ and ‘in water’. This analysis updates previous estimates (Williams & Thomas 2007) by including additional years of effort, providing greater spatial precision with the DSM method over CDS, novel reporting for spring and autumn seasons (rather than summer alone), and providing new abundance estimates for Steller sea lion and northern elephant seal. In addition to providing a baseline of marine mammal abundance and distribution, against which future changes can be compared, this information offers the opportunity to assess the risks posed to marine mammals by existing and emerging threats, such as fisheries bycatch, ship strikes, and increased oil spill and ocean noise issues associated with increases of container ship and oil tanker traffic in British Columbia’s continental shelf waters.
Starting with marine animal observations at specific coordinates and times, I combine these data with environmental data, often satellite derived, to produce seascape predictions generalizable in space and time. These habitat-based models enable prediction of encounter rates and, in the case of density surface models, abundance that can then be applied to management scenarios. Specific human activities, OWED and shipping, are then compared within a tradeoff decision support framework, enabling interchangeable map and tradeoff plot views. These products make complex processes transparent for gaming conservation, industry and stakeholders towards optimal marine spatial management, fundamental to the tenets of marine spatial planning, ecosystem-based management and dynamic ocean management.
Abstract:
Atomic ions trapped in micro-fabricated surface traps can be utilized as a physical platform with which to build a quantum computer. They possess many of the desirable qualities of such a device, including high-fidelity state preparation and readout, universal logic gates, and long coherence times, and they can be readily entangled with each other through photonic interconnects. The use of optical cavities integrated with trapped ion qubits as a photonic interface presents the possibility for order-of-magnitude improvements in performance in several key areas of their use in quantum computation. The first part of this thesis describes the design and fabrication of a novel surface trap for integration with an optical cavity. The trap is custom made on a highly reflective mirror surface and includes the capability of moving the ion trap location along all three trap axes with nanometer scale precision. The second part of this thesis demonstrates the suitability of small micro-cavities formed from laser ablated fused silica substrates with radii of curvature in the 300-500 micron range for use with the mirror trap as part of an integrated ion trap cavity system. Quantum computing applications for such a system include dramatic improvements in the photonic entanglement rate up to 10 kHz, the qubit measurement time down to 1 microsecond, and the measurement error rates down to the 10e-5 range. The final part of this thesis details a performance simulator for exploring the physical resource requirements and performance demands to scale such a quantum computer to sizes capable of performing quantum algorithms beyond the limits of classical computation.
Abstract:
This paper considers forecasting the conditional mean and variance from a single-equation dynamic model with autocorrelated disturbances following an ARMA process, and innovations with time-dependent conditional heteroskedasticity as represented by a linear GARCH process. Expressions for the minimum MSE predictor and the conditional MSE are presented. We also derive the formula for all the theoretical moments of the prediction error distribution from a general dynamic model with GARCH(1, 1) innovations. These results are then used in the construction of ex ante prediction confidence intervals by means of the Cornish-Fisher asymptotic expansion. An empirical example relating to the uncertainty of the expected depreciation of foreign exchange rates illustrates the usefulness of the results. © 1992.
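For orientation, the textbook multi-step variance recursion for a GARCH(1,1) process can be sketched as follows. This is a simplified illustration that ignores the ARMA disturbance structure treated in the paper; the function name and parameter values are hypothetical. The one-step forecast is omega + alpha * (last residual)^2 + beta * (last variance), and each further step applies omega + (alpha + beta) * (previous forecast), so forecasts decay toward the unconditional variance omega / (1 - alpha - beta).

```python
def garch11_variance_forecasts(omega, alpha, beta, last_resid, last_var, horizon):
    """Return conditional variance forecasts for steps 1..horizon.

    Assumes a plain GARCH(1,1) innovation process with alpha + beta < 1.
    """
    forecasts = [omega + alpha * last_resid ** 2 + beta * last_var]
    for _ in range(horizon - 1):
        # Beyond one step, the expected squared residual equals the variance forecast.
        forecasts.append(omega + (alpha + beta) * forecasts[-1])
    return forecasts


# Hypothetical parameters: unconditional variance is 0.1 / (1 - 0.9) = 1.0.
forecasts = garch11_variance_forecasts(
    omega=0.1, alpha=0.1, beta=0.8, last_resid=2.0, last_var=1.0, horizon=3
)
print(forecasts)
```

A large last residual raises the near-term forecasts, which then shrink geometrically at rate alpha + beta toward the long-run level.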
Abstract:
We obtain an upper bound on the time available for quantum computation for a given quantum computer and decohering environment with quantum error correction implemented. First, we derive an explicit quantum evolution operator for the logical qubits and show that it has the same form as that for the physical qubits but with a reduced coupling strength to the environment. Using this evolution operator, we find the trace distance between the real and ideal states of the logical qubits in two cases. For a super-Ohmic bath, the trace distance saturates, while for Ohmic or sub-Ohmic baths, there is a finite time before the trace distance exceeds a value set by the user. © 2010 The American Physical Society.
Abstract:
The ground state structure of C(4N+2) rings is believed to exhibit a geometric transition from angle alternation (N ≤ 2) to bond alternation (N > 2). All previous density functional theory (DFT) studies on these molecules have failed to reproduce this behavior by predicting either that the transition occurs at too large a ring size, or that the transition leads to a higher symmetry cumulene. Employing the recently proposed perspective of delocalization error within DFT we rationalize this failure of common density functional approximations (DFAs) and present calculations with the rCAM-B3LYP exchange-correlation functional that show an angle-to-bond-alternation transition between C(10) and C(14). The behavior exemplified here manifests itself more generally as the well known tendency of DFAs to bias toward delocalized electron distributions as favored by Hückel aromaticity, of which the C(4N+2) rings provide a quintessential example. Additional examples are the relative energies of the C(20) bowl, cage, and ring isomers; we show that the results from functionals with minimal delocalization error are in good agreement with CCSD(T) results, in contrast to other commonly used DFAs. An unbiased DFT treatment of electron delocalization is a key for reliable prediction of relative stability and hence the structures of complex molecules where many structure stabilization mechanisms exist.
Abstract:
BACKGROUND: Most information about the lifetime prevalence of mental disorders comes from retrospective surveys, but how much these surveys have undercounted due to recall failure is unknown. We compared results from a prospective study with those from retrospective studies. METHOD: The representative 1972-1973 Dunedin New Zealand birth cohort (n=1037) was followed to age 32 years with 96% retention, and compared to the national New Zealand Mental Health Survey (NZMHS) and two US National Comorbidity Surveys (NCS and NCS-R). Measures were research diagnoses of anxiety, depression, alcohol dependence and cannabis dependence from ages 18 to 32 years. RESULTS: The prevalence of lifetime disorder to age 32 was approximately doubled in prospective as compared to retrospective data for all four disorder types. Moreover, across disorders, prospective measurement yielded a mean past-year-to-lifetime ratio of 38% whereas retrospective measurement yielded higher mean past-year-to-lifetime ratios of 57% (NZMHS, NCS-R) and 65% (NCS). CONCLUSIONS: Prospective longitudinal studies complement retrospective surveys by providing unique information about lifetime prevalence. The experience of at least one episode of DSM-defined disorder during a lifetime may be far more common in the population than previously thought. Research should ask what this means for etiological theory, construct validity of the DSM approach, public perception of stigma, estimates of the burden of disease and public health policy.
Abstract:
The activation parameters and the rate constants of the water-exchange reactions of Mn(III)TE-2-PyP⁵⁺ (meso-tetrakis(N-ethylpyridinium-2-yl)porphyrin) as cationic, Mn(III)TnHex-2-PyP⁵⁺ (meso-tetrakis(N-n-hexylpyridinium-2-yl)porphyrin) as sterically shielded cationic, and Mn(III)TSPP³⁻ (meso-tetrakis(4-sulfonatophenyl)porphyrin) as anionic manganese(III) porphyrins were determined from the temperature dependence of ¹⁷O NMR relaxation rates. The rate constants at 298 K were obtained as 4.12 × 10⁶ s⁻¹, 5.73 × 10⁶ s⁻¹, and 2.74 × 10⁷ s⁻¹, respectively. On the basis of the determined entropies of activation, an interchange-dissociative mechanism (I_d) was proposed for the cationic complexes (ΔS‡ ≈ 0 J mol⁻¹ K⁻¹) whereas a limiting dissociative mechanism (D) was proposed for the Mn(III)TSPP³⁻ complex (ΔS‡ = +79 J mol⁻¹ K⁻¹). The obtained water exchange rate of Mn(III)TSPP³⁻ corresponded well to the previously assumed value used by Koenig et al. (S. H. Koenig, R. D. Brown and M. Spiller, Magn. Reson. Med., 1987, 4, 52-260) to simulate the ¹H NMRD curves; the measured value therefore supports the theory developed for explaining the anomalous relaxivity of the Mn(III)TSPP³⁻ complex. The magnitude of the obtained water-exchange rate constants further confirms the suggested inner sphere electron transfer mechanism for the reactions of the two positively charged Mn(III) porphyrins with the various biologically important oxygen and nitrogen reactive species. Due to the high biological and clinical relevance of the reactions that occur at the metal site of the studied Mn(III) porphyrins, the determination of water exchange rates advanced our insight into their efficacy and mechanism of action, and in turn should impact their further development for both diagnostic (imaging) and therapeutic purposes.
Abstract:
We implemented a hospital-based influenza vaccination program for household contacts of newborns. Among mothers not vaccinated prenatally, 44.7% were vaccinated through the program, as were 25.7% of fathers. A hospital-based program provided opportunities for vaccination of household contacts of newborns, thereby facilitating better adherence to national vaccination guidelines.
Abstract:
In a stochastic environment, long-term fitness can be influenced by variation, covariation, and serial correlation in vital rates (survival and fertility). Yet no study of an animal population has parsed the contributions of these three aspects of variability to long-term fitness. We do so using a unique database that includes complete life-history information for wild-living individuals of seven primate species that have been the subjects of long-term (22-45 years) behavioral studies. Overall, the estimated levels of vital rate variation had only minor effects on long-term fitness, and the effects of vital rate covariation and serial correlation were even weaker. To explore why, we compared estimated variances of adult survival in primates with values for other vertebrates in the literature and found that adult survival is significantly less variable in primates than it is in the other vertebrates. Finally, we tested the prediction that adult survival, because it more strongly influences fitness in a constant environment, will be less variable than newborn survival, and we found only mixed support for the prediction. Our results suggest that wild primates may be buffered against detrimental fitness effects of environmental stochasticity by their highly developed cognitive abilities, social networks, and broad, flexible diets.
Abstract:
We show that "commodity currency" exchange rates have surprisingly robust power in predicting global commodity prices, both in-sample and out-of-sample, and against a variety of alternative benchmarks. This result is of particular interest to policy makers, given the lack of deep forward markets in many individual commodities, and broad aggregate commodity indices in particular. We also explore the reverse relationship (commodity prices forecasting exchange rates) but find it to be notably less robust. We offer a theoretical resolution, based on the fact that exchange rates are strongly forward-looking, whereas commodity price fluctuations are typically more sensitive to short-term demand imbalances. © 2010 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology.
Abstract:
Droplet-based digital microfluidics technology has now come of age, and software-controlled biochips for healthcare applications are starting to emerge. However, today's digital microfluidic biochips suffer from the drawback that there is no feedback to the control software from the underlying hardware platform. Due to the lack of precision inherent in biochemical experiments, errors are likely during droplet manipulation; error recovery based on the repetition of experiments leads to wastage of expensive reagents and hard-to-prepare samples. By exploiting recent advances in the integration of optical detectors (sensors) into a digital microfluidics biochip, we present a physical-aware system reconfiguration technique that uses sensor data at intermediate checkpoints to dynamically reconfigure the biochip. A cyberphysical resynthesis technique is used to recompute electrode-actuation sequences, thereby deriving new schedules, module placement, and droplet routing pathways, with minimum impact on the time-to-response. © 2012 IEEE.
Abstract:
Evaluating environmental policies, such as the mitigation of greenhouse gases, frequently requires balancing near-term mitigation costs against long-term environmental benefits. Conventional approaches to valuing such investments hold interest rates constant, but the authors contend that there is a real degree of uncertainty in future interest rates. This leads to a higher valuation of future benefits relative to conventional methods that ignore interest rate uncertainty.
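The direction of this effect can be seen in a small numerical sketch. It assumes continuous compounding and a rate that is drawn once from a set of scenarios and then held fixed; the two scenarios, their probabilities, and the function names are hypothetical and are not taken from the paper. Because the discount factor is convex in the rate, averaging discount factors over scenarios gives a larger present value than discounting at the average rate.

```python
import math


def expected_discount_factor(rates, probs, years):
    """Average of exp(-r * years) over the rate scenarios."""
    return sum(p * math.exp(-r * years) for r, p in zip(rates, probs))


def constant_rate_discount_factor(rates, probs, years):
    """Discount factor using the single mean rate (ignores uncertainty)."""
    mean_rate = sum(p * r for r, p in zip(rates, probs))
    return math.exp(-mean_rate * years)


# Hypothetical scenarios: the long-run rate is 2% or 6% with equal probability.
rates, probs, years = [0.02, 0.06], [0.5, 0.5], 100
uncertain = expected_discount_factor(rates, probs, years)
constant = constant_rate_discount_factor(rates, probs, years)
print(uncertain, constant, uncertain / constant)
```

At long horizons the low-rate scenario dominates the average, so the gap between the two valuations widens as `years` grows.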