908 results for "throughput"


Relevance: 10.00%

Abstract:

Background: The recent development of semi-automated techniques for staining and analyzing flow cytometry samples has presented new challenges. Quality control and quality assessment are critical when developing new high throughput technologies and their associated information services. Our experience suggests that significant bottlenecks remain in the development of high throughput flow cytometry methods for data analysis and display. In particular, data quality control and quality assessment are crucial steps in processing and analyzing high throughput flow cytometry data. Methods: We propose a variety of graphical exploratory data analytic tools for exploring ungated flow cytometry data. We have implemented a number of specialized functions and methods in the Bioconductor package rflowcyt. We demonstrate the use of these approaches by investigating two independent sets of high throughput flow cytometry data. Results: We found that graphical representations can reveal substantial non-biological differences in samples. Empirical cumulative distribution function (ECDF) plots and summary scatterplots were especially useful for rapidly identifying problems that manual review had missed. Conclusions: Graphical exploratory data analytic tools are a quick and useful means of assessing data quality. We propose that the described visualizations be used as quality assessment tools and, where possible, for quality control.
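
To make the ECDF idea concrete, below is a minimal base-R sketch (not the rflowcyt implementation) that overlays per-sample ECDFs of one ungated fluorescence channel; the intensity data are simulated, and a deliberately shifted sample stands in for a non-biological artifact.

```r
# Minimal base-R sketch: overlay per-sample ECDFs of one ungated fluorescence
# channel to flag non-biological differences between samples.
# The intensities are simulated; sample 5 is deliberately shifted to mimic a
# staining or instrument problem.
set.seed(1)
n_samples <- 8
intensities <- lapply(seq_len(n_samples), function(i) {
  shift <- if (i == 5) 0.8 else 0
  rlnorm(5000, meanlog = 4 + shift, sdlog = 0.6)
})

plot(ecdf(log10(intensities[[1]])), verticals = TRUE, do.points = FALSE,
     xlim = range(log10(unlist(intensities))),
     main = "Per-sample ECDFs of an ungated channel",
     xlab = "log10 intensity", ylab = "F(x)", col = 1)
for (i in 2:n_samples) {
  plot(ecdf(log10(intensities[[i]])), add = TRUE, verticals = TRUE,
       do.points = FALSE, col = i)
}
legend("bottomright", legend = paste("sample", seq_len(n_samples)),
       col = seq_len(n_samples), lty = 1, cex = 0.7)
```

In a plot like this, the shifted sample separates visibly from the bundle of curves even though a per-sample manual review of gates might not flag it.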

Relevance: 10.00%

Abstract:

The last few years have seen the advent of high-throughput technologies to analyze various properties of the transcriptome and proteome of several organisms. The congruency of these different data sources, or lack thereof, can shed light on the mechanisms that govern cellular function. A central challenge for bioinformatics research is to develop a unified framework for combining the multiple sources of functional genomics information and testing associations between them, thus obtaining a robust and integrated view of the underlying biology. We present a graph theoretic approach to test the significance of the association between multiple disparate sources of functional genomics data by proposing two statistical tests, namely edge permutation and node label permutation tests. We demonstrate the use of the proposed tests by finding significant association between a Gene Ontology-derived "predictome" and data obtained from mRNA expression and phenotypic experiments for Saccharomyces cerevisiae. Moreover, we employ the graph theoretic framework to recast a surprising discrepancy presented in Giaever et al. (2002) between gene expression and knockout phenotype, using expression data from a different set of experiments.
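
As an illustration of the node-label permutation idea, here is a small base-R sketch under simulated data: the graph, labels, and test statistic (the number of edges joining like-labelled nodes) are stand-ins chosen for clarity, not the statistics or data from the paper.

```r
# Node-label permutation test on a functional-linkage graph (toy version).
# Each node carries a binary label (e.g., "phenotype observed"); the statistic
# counts edges joining two like-labelled nodes, and the null distribution is
# obtained by permuting labels over nodes while keeping the graph fixed.
set.seed(2)
n_nodes <- 200
edges <- cbind(sample(n_nodes, 600, replace = TRUE),
               sample(n_nodes, 600, replace = TRUE))
edges <- edges[edges[, 1] != edges[, 2], ]          # drop self-loops
labels <- rbinom(n_nodes, 1, 0.3)

concordant_edges <- function(lab) sum(lab[edges[, 1]] == lab[edges[, 2]])

observed  <- concordant_edges(labels)
null_dist <- replicate(10000, concordant_edges(sample(labels)))  # permute node labels
p_value   <- (sum(null_dist >= observed) + 1) / (length(null_dist) + 1)
p_value
```

An edge permutation test follows the same template, except that the edge set (rather than the label vector) is randomized under the null.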

Relevance: 10.00%

Abstract:

The advances in computational biology have made simultaneous monitoring of thousands of features possible. High throughput technologies not only bring about a much richer information context in which to study various aspects of gene function, but they also present the challenge of analyzing data with a large number of covariates and few samples. As an integral part of machine learning, classification of samples into two or more categories is almost always of interest to scientists. In this paper, we address the question of classification in this setting by extending partial least squares (PLS), a popular dimension reduction tool in chemometrics, to generalized linear regression, building on a previous approach, Iteratively ReWeighted Partial Least Squares (IRWPLS; Marx, 1996). We compare our results with two-stage PLS (Nguyen and Rocke, 2002a; Nguyen and Rocke, 2002b) and other classifiers. We show that by phrasing the problem in a generalized linear model setting and by applying a bias correction to the likelihood to avoid (quasi-)separation, we often obtain lower classification error rates.
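
For intuition, the sketch below shows the simpler two-stage flavor of this idea (PLS scores fed into ordinary logistic regression) on simulated data; it is not the IRWPLS estimator or its bias-corrected likelihood, and the component count K is an arbitrary choice.

```r
# Two-stage sketch: extract a few PLS components from a wide expression matrix
# (NIPALS-style deflation) and feed the scores to logistic regression. This
# illustrates the dimension-reduction-then-GLM idea, not IRWPLS itself.
set.seed(3)
n <- 40; p <- 500; K <- 3
X <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, plogis(X[, 1] - X[, 2]))

Xc <- scale(X, center = TRUE, scale = FALSE)
yc <- y - mean(y)
scores <- matrix(0, n, K)
for (k in seq_len(K)) {
  w <- crossprod(Xc, yc); w <- w / sqrt(sum(w^2))       # weight vector
  t_k <- Xc %*% w                                       # component scores
  scores[, k] <- t_k
  Xc <- Xc - t_k %*% crossprod(t_k, Xc) / sum(t_k^2)    # deflate X
  yc <- yc - t_k * sum(t_k * yc) / sum(t_k^2)           # deflate y
}

fit <- glm(y ~ scores, family = binomial)               # second stage: GLM on scores
summary(fit)$coefficients
```

With many covariates and few samples, fitting the GLM directly on X would be hopeless; reducing to a handful of scores first is what makes the second-stage fit feasible.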

Relevance: 10.00%

Abstract:

Submicroscopic changes in chromosomal DNA copy number are common and have been implicated in many heritable diseases and cancers. Recent high-throughput technologies have a resolution that permits the detection of segmental changes in DNA copy number that span thousands of base pairs across the genome. Genome-wide association studies (GWAS) may simultaneously screen for copy number-phenotype and SNP-phenotype associations as part of the analytic strategy. However, genome-wide array analyses are particularly susceptible to batch effects, as the logistics of preparing DNA and processing thousands of arrays often involve multiple laboratories and technicians, or changes over calendar time to the reagents and laboratory equipment. Failure to adjust for batch effects can lead to incorrect inference and requires inefficient post-hoc quality control procedures that exclude regions associated with batch. Our work extends previous model-based approaches for copy number estimation by explicitly modeling batch effects and using shrinkage to improve locus-specific estimates of copy number uncertainty. Key features of this approach include the use of diallelic genotype calls from experimental data to estimate batch- and locus-specific parameters of background and signal without requiring training data. We illustrate these ideas using a study of bipolar disease and a study of chromosome 21 trisomy. The former has batch effects that dominate much of the observed variation in quantile-normalized intensities, while the latter illustrates the robustness of our approach to datasets in which as many as 25% of the samples have altered copy number. Locus-specific estimates of copy number can be plotted on the copy-number scale to investigate mosaicism and to guide the choice of appropriate downstream approaches for smoothing the copy number as a function of physical position. The software is open source and implemented in the R package CRLMM, available from Bioconductor (http://www.bioconductor.org).
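
The shrinkage component can be illustrated with a generic empirical-Bayes-style sketch: noisy locus-level variance estimates are pulled toward a batch-level value, improving mean squared error. Everything below (prior degrees of freedom, simulated intensities) is assumed for illustration and is far simpler than the model described above.

```r
# Schematic shrinkage of locus-specific variance estimates toward a batch-wide
# value. Locus variances based on few samples are noisy; the shrunken estimate
# is a precision-weighted compromise between the raw estimate and the batch
# median. Simulated data only; not the CRLMM model.
set.seed(4)
n_loci <- 1000
n_per_locus <- sample(3:30, n_loci, replace = TRUE)   # samples per genotype class
true_var <- rgamma(n_loci, shape = 4, rate = 20)
raw_var <- sapply(seq_len(n_loci), function(i) {
  var(rnorm(n_per_locus[i], sd = sqrt(true_var[i])))
})

prior_var <- median(raw_var)                # batch-level "typical" variance
prior_df  <- 10                             # strength of shrinkage (assumed)
shrunk_var <- (prior_df * prior_var + (n_per_locus - 1) * raw_var) /
              (prior_df + n_per_locus - 1)

mean((raw_var    - true_var)^2)             # MSE of raw estimates
mean((shrunk_var - true_var)^2)             # MSE after shrinkage (typically smaller)
```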

Relevance: 10.00%

Abstract:

In most microarray technologies, a number of critical steps are required to convert raw intensity measurements into the data relied upon by data analysts, biologists, and clinicians. These data manipulations, referred to as preprocessing, can influence the quality of the ultimate measurements. In the last few years, the high-throughput measurement of gene expression has been the most popular application of microarray technology. For this application, various groups have demonstrated that the use of modern statistical methodology can substantially improve the accuracy and precision of gene expression measurements, relative to ad hoc procedures introduced by designers and manufacturers of the technology. Currently, other applications of microarrays are becoming more and more popular. In this paper we describe a preprocessing methodology for a technology designed for the identification of DNA sequence variants in specific genes or regions of the human genome that are associated with phenotypes of interest, such as disease. In particular, we describe methodology useful for preprocessing Affymetrix SNP chips and obtaining genotype calls from the preprocessed data. We demonstrate how our procedure improves on existing approaches using data from three relatively large studies, including one in which a large number of independent calls are available. Software implementing these ideas is available in the Bioconductor oligo package.
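
A toy sketch of the genotype-calling step, assuming preprocessed allele A and allele B intensities for a single SNP: samples are clustered on the allelic log-ratio into the three diallelic genotypes. The simulated intensities and the use of k-means are illustrative assumptions, not the calling algorithm described above.

```r
# Toy genotype calling at one SNP from preprocessed allele intensities.
# Samples separate into AA / AB / BB along the allelic log-ratio log2(A/B).
set.seed(5)
geno <- sample(c("AA", "AB", "BB"), 300, replace = TRUE, prob = c(.49, .42, .09))
a <- rlnorm(300, meanlog = ifelse(geno == "BB", 6, 8), sdlog = 0.25)  # allele A signal
b <- rlnorm(300, meanlog = ifelse(geno == "AA", 6, 8), sdlog = 0.25)  # allele B signal

m  <- log2(a) - log2(b)                      # allelic log-ratio
km <- kmeans(m, centers = 3, nstart = 20)

labels_by_centre <- c("BB", "AB", "AA")[rank(km$centers)]  # lowest log-ratio = BB
calls <- labels_by_centre[km$cluster]
table(truth = geno, call = calls)            # confusion table against simulated truth
```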

Relevance: 10.00%

Abstract:

Functional neuroimaging techniques enable investigations into the neural basis of human cognition, emotions, and behaviors. In practice, applications of functional magnetic resonance imaging (fMRI) have provided novel insights into the neuropathophysiology of major psychiatric, neurological, and substance abuse disorders, as well as into the neural responses to their treatments. Modern activation studies often compare localized task-induced changes in brain activity between experimental groups. One may also extend voxel-level analyses by simultaneously considering the ensemble of voxels constituting an anatomically defined region of interest (ROI) or by considering means or quantiles of the ROI. In this work we present a Bayesian extension of voxel-level analyses that offers several notable benefits. First, it combines whole-brain voxel-by-voxel modeling and ROI analyses within a unified framework. Second, an unstructured variance/covariance matrix for regional mean parameters allows for the study of inter-regional functional connectivity, provided enough subjects are available to allow accurate estimation. Finally, an exchangeable correlation structure within regions allows for the consideration of intra-regional functional connectivity. We perform estimation for our model using Markov chain Monte Carlo (MCMC) techniques implemented via Gibbs sampling, which, despite the high throughput nature of the data, can be executed quickly (less than 30 minutes). We apply our Bayesian hierarchical model to two novel fMRI data sets: one considering inhibitory control in cocaine-dependent men and the second considering verbal memory in subjects at high risk for Alzheimer's disease. The unifying hierarchical model presented in this manuscript is shown to enhance the interpretation of these data sets.
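
To show the flavor of the Gibbs sampling step, here is a toy two-level normal model (voxel effects centered on a regional mean) with fixed variance hyperparameters; the fMRI model above is far richer, and all data and hyperparameters below are assumed.

```r
# Toy Gibbs sampler: voxel effects theta_v ~ N(mu, tau2), data y_vj ~ N(theta_v, sigma2).
# Variances are held fixed for brevity; only mu and theta are sampled.
set.seed(6)
V <- 50; n <- 20
mu_true    <- 0.5
theta_true <- rnorm(V, mu_true, 0.3)
y <- sapply(theta_true, function(t) rnorm(n, t, 1))   # n x V matrix of voxel data

sigma2 <- 1; tau2 <- 0.3^2; mu0 <- 0; s0 <- 10        # fixed/assumed hyperparameters
n_iter <- 2000
mu <- 0; theta <- rep(0, V)
draws <- matrix(NA, n_iter, 1 + V)
for (it in seq_len(n_iter)) {
  # theta_v | rest: precision-weighted combination of the voxel data mean and mu
  post_prec <- n / sigma2 + 1 / tau2
  post_mean <- (colSums(y) / sigma2 + mu / tau2) / post_prec
  theta <- rnorm(V, post_mean, sqrt(1 / post_prec))
  # mu | rest: normal around the average of the theta_v
  prec_mu <- V / tau2 + 1 / s0
  mean_mu <- (sum(theta) / tau2 + mu0 / s0) / prec_mu
  mu <- rnorm(1, mean_mu, sqrt(1 / prec_mu))
  draws[it, ] <- c(mu, theta)
}
colMeans(draws[-(1:500), 1:4])   # posterior means of mu and the first three voxel effects
```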

Relevance: 10.00%

Abstract:

Functional magnetic resonance imaging (fMRI) is a non-invasive technique commonly used to quantify changes in blood oxygenation and flow coupled to neuronal activation. One of the primary goals of fMRI studies is to identify localized brain regions where neuronal activation levels vary between groups. Single-voxel t-tests have commonly been used to determine whether activation related to the protocol differs across groups. Due to the generally limited number of subjects within each study, accurate estimation of the variance at each voxel is difficult. Thus, combining information across voxels in the statistical analysis of fMRI data is desirable in order to improve efficiency. Here we construct a hierarchical model and apply an empirical Bayes framework to the analysis of group fMRI data, employing techniques used in high throughput genomic studies. The key idea is to shrink residual variances by combining information across voxels, and subsequently to construct an improved test statistic in lieu of the classical t-statistic. The hierarchical model shrinks voxel-wise residual sample variances towards a common value. The shrunken estimator for voxel-specific variance components in the group analyses outperforms the classical residual error estimator in terms of mean squared error. Moreover, the shrunken test statistic decreases the false positive rate when testing differences in brain contrast maps across a wide range of simulation studies. The methodology was also applied to experimental data from a cognitive activation task.
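
A compact sketch of the variance-shrinkage idea, in the spirit of moderated statistics from genomics: pooled voxel variances are shrunk toward a common value before forming a two-sample statistic. The prior degrees of freedom and the simulated contrast data are assumptions for illustration, not the estimator developed in the paper.

```r
# Moderated two-sample statistic across voxels: pooled residual variances are
# shrunk toward a common value (weights given by prior vs. residual df) before
# forming the test statistic.
set.seed(7)
V <- 2000; n1 <- 12; n2 <- 12
g1 <- matrix(rnorm(V * n1), V, n1)
g2 <- matrix(rnorm(V * n2), V, n2)
g2[1:50, ] <- g2[1:50, ] + 0.8               # 50 truly "active" voxels

s2 <- (apply(g1, 1, var) * (n1 - 1) + apply(g2, 1, var) * (n2 - 1)) / (n1 + n2 - 2)
d  <- n1 + n2 - 2                            # residual degrees of freedom
d0 <- 8; s02 <- mean(s2)                     # assumed prior df and prior variance
s2_shrunk <- (d0 * s02 + d * s2) / (d0 + d)  # shrunken voxel-wise variance

diff_mean <- rowMeans(g2) - rowMeans(g1)
se    <- sqrt(s2_shrunk * (1 / n1 + 1 / n2))
t_mod <- diff_mean / se
head(order(-abs(t_mod)), 10)                 # top-ranked voxels
```

Voxels whose raw variance happens to be tiny no longer dominate the ranking, which is exactly the mechanism by which the false positive rate drops.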

Relevance: 10.00%

Abstract:

Genotyping platforms such as Affymetrix can be used to assess genotype-phenotype as well as copy number-phenotype associations at millions of markers. While genotyping algorithms are largely concordant when assessed on HapMap samples, tools to assess copy number changes are more variable and often discordant. One explanation for the discordance is that copy number estimates are susceptible to systematic differences between groups of samples that were processed at different times or by different labs. Analysis algorithms that do not adjust for batch effects are prone to spurious measures of association. The R package crlmm implements a multilevel model that adjusts for batch effects and provides allele-specific estimates of copy number. This paper illustrates a workflow for the estimation of allele-specific copy number, develops marker- and study-level summaries of batch effects, and demonstrates how the marker-level estimates can be integrated with complementary Bioconductor software for inferring regions of copy number gain or loss. All analyses are performed in the statistical environment R. A compendium for reproducing the analysis is available from the author's website (http://www.biostat.jhsph.edu/~rscharpf/crlmmCompendium/index.html).
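
One way to picture a marker-level batch summary is a per-marker ANOVA of normalized intensity on processing batch, with the distribution of the resulting F-statistics serving as the study-level summary. The sketch below uses simulated data and plain lm()/anova(); it is not the crlmm workflow or its output.

```r
# Marker-level batch-effect summary: per-marker F-statistic of intensity on batch.
# Markers strongly associated with batch stand out in the upper tail.
set.seed(8)
n_samples <- 90
batch <- factor(rep(c("plate1", "plate2", "plate3"), each = 30))
n_markers <- 500
intensity <- matrix(rnorm(n_markers * n_samples), n_markers, n_samples)
intensity[1:25, batch == "plate2"] <- intensity[1:25, batch == "plate2"] + 0.7  # batch-affected markers

batch_F <- apply(intensity, 1, function(y) {
  fit <- lm(y ~ batch)
  anova(fit)[["F value"]][1]
})

summary(batch_F)              # study-level view of batch association
head(order(-batch_F), 10)     # markers most associated with batch
```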

Relevance: 10.00%

Abstract:

The detection of virulence determinants harbored by pathogenic Escherichia coli is important for establishing the pathotype responsible for infection. A sensitive and specific miniaturized virulence microarray containing 60 oligonucleotide probes was developed. It detected six E. coli pathotypes and will be suitable in the future for high-throughput use.

Relevance: 10.00%

Abstract:

Cu is an essential nutrient for man, but can be toxic if intakes are too high. In sensitive populations, marginal over- or under-exposure can have detrimental effects. Malnourished children, the elderly, and pregnant or lactating females may be susceptible to Cu deficiency. Cu status and exposure in the population cannot currently be measured easily, as neither plasma Cu nor plasma cuproenzymes reflect Cu status precisely. Some blood markers (such as ceruloplasmin) indicate severe Cu depletion, but do not respond inversely to Cu excess and are not suitable for indicating marginal states. A biomarker of Cu is needed that is sensitive to small changes in Cu status and that responds to Cu excess as well as deficiency. Such a marker will aid in monitoring Cu status in large populations, and will help to avoid chronic health effects (for example, liver damage in chronic toxicity; osteoporosis, loss of collagen stability, or increased susceptibility to infections in deficiency). The advent of high-throughput technologies has enabled us to screen for potential biomarkers in the whole proteome of a cell, not excluding markers that have no direct link to Cu. Further, this screening allows us to search for a whole group of proteins that, in combination, reflect Cu status. The present review emphasises the need to find sensitive biomarkers for Cu, examines potential markers of Cu status already available, and discusses methods to identify a novel suite of biomarkers.

Relevance: 10.00%

Abstract:

Trypanosoma brucei rhodesiense and T. b. gambiense are the causative agents of sleeping sickness, a fatal disease that affects 36 countries in sub-Saharan Africa. Nevertheless, only a handful of clinically useful drugs are available, and these drugs suffer from severe side-effects. The situation is further aggravated by the alarming incidence of treatment failures in several sleeping sickness foci, apparently indicating the occurrence of drug-resistant trypanosomes. For these reasons, and since vaccination does not appear to be feasible due to the trypanosomes' ever-changing coat of variable surface glycoproteins (VSGs), new drugs are needed urgently. The entry of Trypanosoma brucei into the post-genomic age raises hopes for the identification of novel kinds of drug targets and, in turn, new treatments for sleeping sickness. The pragmatic definition of a drug target is a protein that is essential for the parasite and has no homologues in the host. Such proteins are identified by comparing the predicted proteomes of T. brucei and Homo sapiens, and then validated by large-scale gene disruption or gene silencing experiments in trypanosomes. Once all proteins that are essential and unique to the parasite are identified, inhibitors may be found by high-throughput screening. However powerful, this functional genomics approach is going to miss a number of attractive targets. Several current, successful parasiticides attack proteins that have close homologues in the human proteome. Drugs like DFMO or pyrimethamine inhibit parasite and host enzymes alike; a therapeutic window is opened only by subtle differences in the regulation of the targets, which cannot be recognized in silico. Also working against the post-genomic approach is the fact that essential proteins tend to be more highly conserved between species than non-essential ones. Here we advocate drug targeting, i.e. uptake or activation of a drug via parasite-specific pathways, as a chemotherapeutic strategy to selectively inhibit enzymes that have equally sensitive counterparts in the host. The T. brucei purine salvage machinery offers opportunities for both metabolic and transport-based targeting: unusual nucleoside and nucleobase permeases may be exploited for selective import, and salvage enzymes for selective activation of purine antimetabolites.

Relevance: 10.00%

Abstract:

In this study, we present a novel genotyping scheme to classify German wild-type varicella-zoster virus (VZV) strains and to differentiate them from the Oka vaccine strain (genotype B). The approach is based on the analysis of four loci in open reading frames (ORFs) 51 to 58, encompassing a total length of 1,990 bp. The new genotyping scheme produced clusters identical to those obtained in phylogenetic analyses of full-genome sequences from well-characterized VZV strains. Based on genotype A, D, B, and C reference strains, a dichotomous identification key (DIK) was developed and applied to VZV strains obtained from vesicle fluid and cerebrospinal fluid samples from 42 patients suffering from varicella or zoster between 2003 and 2006. Sequencing of regions in ORFs 51, 52, 53, 56, 57, and 58 identified 18 single-nucleotide polymorphisms (SNPs), including two novel ones, SNP 89727 and SNP 92792 in ORF51 and ORF52, respectively. The DIK as well as phylogenetic analysis by Bayesian inference showed that 14 VZV strains belonged to genotype A, and 28 VZV strains were classified as genotype D. Neither Japanese (vaccine)-like B strains nor recombinant-like C strains were found among the samples from Germany. The novel genotyping scheme and the DIK proved practical and simple, and they efficiently reproduce the phylogenetic patterns in VZV initially derived from full-genome DNA sequence analyses. Therefore, this approach may allow a more comprehensive picture to be drawn of wild-type VZV strains circulating in Germany and Central Europe using high-throughput procedures in the future.
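
For readers unfamiliar with dichotomous keys, the sketch below shows only the general structure: a short cascade of single-site decisions ending in a genotype assignment. The sites and alleles are invented placeholders, not the actual VZV markers or decision points from this study.

```r
# Purely schematic dichotomous key: each step inspects one nucleotide state and
# either assigns a genotype or moves to the next step. Sites/alleles are invented.
classify_strain <- function(snp) {            # snp: named character vector of bases
  if (snp["site1"] == "A") {
    if (snp["site2"] == "G") "genotype A" else "genotype D"
  } else {
    if (snp["site3"] == "T") "genotype B" else "genotype C"
  }
}

classify_strain(c(site1 = "A", site2 = "G", site3 = "T"))   # -> "genotype A"
classify_strain(c(site1 = "C", site2 = "G", site3 = "C"))   # -> "genotype C"
```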

Relevance: 10.00%

Abstract:

Reliable data transfer is one of the most difficult tasks to accomplish in multihop wireless networks. Traditional transport protocols like TCP face severe performance degradation over multihop networks given the noisy nature of the wireless medium and the unstable connectivity conditions. The success of TCP in wired networks motivates its extension to wireless networks. A crucial challenge faced by TCP over these networks is how to operate smoothly with the 802.11 wireless MAC protocol, which implements a link-level retransmission mechanism in addition to short RTS/CTS control frames for collision avoidance. These features make the transmission of TCP acknowledgments (ACKs) quite costly: data and ACK packets incur similar medium-access overheads despite the much smaller size of the ACKs. In this paper, we further evaluate our dynamic adaptive strategy for reducing ACK-induced overhead and the consequent collisions. Our approach mirrors the sender side's congestion control: the receiver adapts itself by delaying more ACKs when the channel is unconstrained and fewer otherwise. This improves not only throughput but also power consumption. Simulation evaluations show significant improvements in several scenarios.
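
A schematic of the receiver-side adaptation might look like the following: the ACK spacing factor k grows while segments arrive in sequence and collapses back to 1 on a gap. The thresholds, the cap k_max, and the packet trace are invented for illustration; the strategy evaluated in the paper is more elaborate (a real receiver would, for instance, also ACK on a timer).

```r
# Schematic receiver-side delayed-ACK adaptation: ACK every k-th in-order packet,
# increase k while the channel looks clean, reset k when a gap is observed.
adaptive_acks <- function(seq_nums, k_max = 4) {
  k <- 1; since_ack <- 0; expected <- seq_nums[1]; acks <- integer(0)
  for (s in seq_nums) {
    if (s != expected) {              # gap (loss/reordering): be conservative
      k <- 1
      acks <- c(acks, s); since_ack <- 0
    } else {
      since_ack <- since_ack + 1
      if (since_ack >= k) {           # delayed cumulative ACK
        acks <- c(acks, s); since_ack <- 0
        k <- min(k + 1, k_max)        # channel looks clean: delay more next time
      }
    }
    expected <- s + 1
  }
  acks                                # a real receiver would also ACK on timeout
}

trace <- c(1:6, 8:15)                 # packet 7 lost
adaptive_acks(trace)                  # far fewer ACKs than packets; reset at the gap
```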

Relevance: 10.00%

Abstract:

Sustainable management of solid waste is a global concern, as exemplified by the United Nations Millennium Development Goals (MDG) that 191 member states support. The seventh MDG indirectly advocates for municipal solid waste management (MSWM) by aiming to ensure environmental sustainability in countries' policies and programs and to reverse negative environmental impact. Proper MSWM will likely result in relieving poverty, reducing child mortality, improving maternal health, and preventing disease, which are MDG goals one, four, five, and six, respectively (UNMDG, 2005).

Solid waste production is increasing worldwide as the global society strives to obtain a decent quality of life. Several means exist by which the amount of solid waste going to a landfill can be reduced, such as incineration with energy production, composting of organic wastes, and material recovery through recycling, all of which are considered sustainable methods of managing MSW. In the developing world, composting is already a widely accepted method of reducing waste fated for the landfill, and incineration for energy recovery can be a costly capital investment for most communities. Therefore, this research focuses on recycling as a solution to the municipal solid waste production problem while considering the three dimensions of sustainability: environment, society, and economy.

First, twenty-three developing country case studies were quantitatively and qualitatively examined for aspects of municipal solid waste management. The municipal solid waste (MSW) generation and recovery rates, as well as the composition, were compiled and assessed. The average MSW generation rate was 0.77 kg/person/day, with recovery rates varying from 5 – 40%. The waste streams of nineteen of these case studies consisted of 0 – 70% recyclable material and 17 – 80% organic material.

All twenty-three case studies were analyzed qualitatively by identifying any barriers or incentives to recycling, which justified the creation of twelve factors influencing sustainable municipal solid waste management (MSWM) in developing countries. The presence of regulations, enforcement of laws, and use of incentive schemes constitutes the first factor, Government Policy. The cost of MSWM operations, the budget allocated to MSWM by local to national governments, and the stability and reliability of funds comprise the Government Finances factor influencing recycling in the third world. Many case studies indicated that understanding features of a waste stream, such as the generation and recovery rates and composition, is the first measure in determining proper management solutions, which forms the third factor, Waste Characterization. The presence and efficiency of waste collection and segregation by scavengers, municipalities, or private contractors was commonly addressed by the case studies, which justified Waste Collection and Segregation as the fourth factor. Having knowledge of MSWM and an understanding of the linkages between human behavior, waste handling, and health/sanitation/environment comprise the Household Education factor. Individuals' income influencing waste handling behavior (e.g., reuse, recycling, and illegal dumping), the presence of waste collection/disposal fees, and residents' willingness to pay were seen as among the biggest incentives to recycling, which justified combining them into the Household Economics factor.

The MSWM Administration factor was formed following several references to the presence and effectiveness of private and/or public management of waste through collection, recovery, and disposal influencing recycling activity. Although the MSWM Personnel Education factor was recognized by only six of the twenty-two case studies, the lack of trained laborers and skilled professionals in MSWM positions was a barrier to sustainable MSWM in every case but one. The presence and effectiveness of a comprehensive, integrative, long-term MSWM strategy was highly encouraged by every case study that addressed the MSWM Plan factor. Although seemingly a subset of private MSWM administration, the existence and profitability of market systems relying on recycled-material throughput, and the involvement of small businesses, middlemen, and large industries/exporters, deserve their own factor, Local Recycled-Material Market. The availability and effective use of technology and/or a human workforce, and the safety considerations of each, were recurrent barriers and incentives to recycling that warrant the Technological and Human Resources factor. The Land Availability factor takes into consideration land attributes such as terrain, ownership, and development, which can often dictate MSWM.

Understanding the relationships among the twelve factors influencing recycling in developing countries made apparent the collaborative nature required of sustainable MSWM. The factors requiring the greatest collaborative inputs include waste collection and segregation, the MSWM plan, and the local recycled-material market. Aligning each factor with the societal, environmental, and economic dimensions of sustainability revealed the motives behind the institutions contributing to each factor. A correlation between stakeholder involvement and sustainability existed, as supported by the fact that the only three factors driven by all three dimensions of sustainability were the same three that required the greatest collaboration with other factors. With increasing urbanization, advocacy for improved health for all through the MDG, and changing consumption patterns resulting in larger and more complex waste streams, the collaboration web offered by this research is ever more needed in the developing world. Through its use, the institutions associated with each of the twelve factors can achieve a better understanding of the collaboration necessary and beneficial for more sustainable MSWM.

Relevance: 10.00%

Abstract:

This thesis develops high-performance real-time signal processing modules for direction of arrival (DOA) estimation for localization systems. It proposes highly parallel algorithms for performing subspace decomposition and polynomial rooting, which are otherwise traditionally implemented using sequential algorithms. The proposed algorithms address the emerging need for real-time localization in a wide range of applications. As the antenna array size increases, the complexity of the signal processing algorithms increases, making it increasingly difficult to satisfy real-time constraints. This thesis addresses real-time implementation by proposing parallel algorithms that offer considerable improvement over traditional algorithms, especially for systems with a larger number of antenna array elements. Singular value decomposition (SVD) and polynomial rooting are two computationally complex steps and act as the bottleneck to achieving real-time performance. The proposed algorithms are suitable for implementation on field programmable gate arrays (FPGAs), single instruction multiple data (SIMD) hardware, or application-specific integrated circuits (ASICs), which offer large numbers of processing elements that can be exploited for parallel processing. The designs proposed in this thesis are modular, easily expandable, and easy to implement.

Firstly, this thesis proposes a fast-converging SVD algorithm. The proposed method reduces the number of iterations needed to converge to the correct singular values, thus bringing performance closer to real time. A general algorithm and a modular system design are provided, making it easy for designers to replicate and extend the design to larger matrix sizes. Moreover, the method is highly parallel, which can be exploited on the hardware platforms mentioned earlier. A fixed-point implementation of the proposed SVD algorithm is presented. The FPGA design is pipelined to the maximum extent to increase the maximum achievable frequency of operation. The system was developed with the objective of achieving high throughput, and various modern cores available in FPGAs were used to maximize performance; these modules are described in detail.

Finally, a parallel polynomial rooting technique based on Newton's method, applicable exclusively to root-MUSIC polynomials, is proposed. Unique characteristics of the root-MUSIC polynomial's complex dynamics were exploited to derive this rooting method. The technique exhibits parallelism and converges to the desired root within a fixed number of iterations, making it suitable for rooting polynomials of large degree. We believe this is the first time that the complex dynamics of the root-MUSIC polynomial have been analyzed to propose a rooting algorithm. In all, the thesis addresses two major bottlenecks in a direction of arrival estimation system by providing simple, high-throughput, parallel algorithms.
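
As a point of reference for the rooting step, the sketch below applies plain Newton iterations to a polynomial from several independent starting points, each of which could be refined in parallel. It is generic Newton rooting on an arbitrary cubic, not the specialized root-MUSIC scheme developed in the thesis.

```r
# Generic, parallel-friendly polynomial rooting with Newton's method: each start
# point iterates independently (the per-root refinements could run concurrently).
horner <- function(coefs, z) {            # coefs: c0 + c1*z + ... + cn*z^n
  p <- 0+0i; dp <- 0+0i
  for (c in rev(coefs)) { dp <- dp * z + p; p <- p * z + c }   # Horner for p and p'
  list(p = p, dp = dp)
}

newton_root <- function(coefs, z0, iters = 30, tol = 1e-12) {
  z <- z0
  for (i in seq_len(iters)) {
    h <- horner(coefs, z)
    step <- h$p / h$dp                    # Newton step p(z)/p'(z)
    z <- z - step
    if (Mod(step) < tol) break
  }
  z
}

coefs  <- c(-6, 11, -6, 1)                # (z-1)(z-2)(z-3)
starts <- c(0.5 + 0.1i, 2.2 - 0.1i, 3.5 + 0i)
sapply(starts, function(z0) newton_root(coefs, z0))
sort(Re(polyroot(coefs)))                 # reference roots from base R
```

Because each starting point is refined independently with a fixed iteration budget, the loop over starts maps naturally onto parallel processing elements, which is the property the thesis exploits in hardware.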