13 results for multiple data

at Duke University


Relevance: 30.00%

Abstract:

Long term, high quality estimates of burned area are needed for improving both prognostic and diagnostic fire emissions models and for assessing feedbacks between fire and the climate system. We developed global, monthly burned area estimates aggregated to 0.5° spatial resolution for the time period July 1996 through mid-2009 using four satellite data sets. From 2001–2009, our primary data source was 500-m burned area maps produced using Moderate Resolution Imaging Spectroradiometer (MODIS) surface reflectance imagery; more than 90% of the global area burned during this time period was mapped in this fashion. During times when the 500-m MODIS data were not available, we used a combination of local regression and regional regression trees developed over periods when burned area and Terra MODIS active fire data were available to indirectly estimate burned area. Cross-calibration with fire observations from the Tropical Rainfall Measuring Mission (TRMM) Visible and Infrared Scanner (VIRS) and the Along-Track Scanning Radiometer (ATSR) allowed the data set to be extended prior to the MODIS era. With our data set we estimated that the global annual area burned for the years 1997–2008 varied between 330 and 431 Mha, with the maximum occurring in 1998. We compared our data set to the recent GFED2, L3JRC, GLOBCARBON, and MODIS MCD45A1 global burned area products and found substantial differences in many regions. Lastly, we assessed the interannual variability and long-term trends in global burned area over the past 13 years. This burned area time series serves as the basis for the third version of the Global Fire Emissions Database (GFED3) estimates of trace gas and aerosol emissions.

Relevance: 30.00%

Abstract:

We propose a novel data-delivery method for delay-sensitive traffic that significantly reduces the energy consumption in wireless sensor networks without reducing the number of packets that meet end-to-end real-time deadlines. The proposed method, referred to as SensiQoS, leverages the spatial and temporal correlation between the data generated by events in a sensor network and realizes energy savings through application-specific in-network aggregation of the data. SensiQoS maximizes energy savings by adaptively waiting for packets from upstream nodes to perform in-network processing without missing the real-time deadline for the data packets. SensiQoS is a distributed packet scheduling scheme, where nodes make localized decisions on when to schedule a packet for transmission to meet its end-to-end real-time deadline and to which neighbor they should forward the packet to save energy. We also present a localized algorithm for nodes to adapt to network traffic to maximize energy savings in the network. Simulation results show that SensiQoS improves the energy savings in sensor networks where events are sensed by multiple nodes, and spatial and/or temporal correlation exists among the data packets. Energy savings due to SensiQoS increase with increase in the density of the sensor nodes and the size of the sensed events. © 2010 Harshavardhan Sabbineni and Krishnendu Chakrabarty.

Relevance: 30.00%

Abstract:

BACKGROUND: Ganglioside biosynthesis occurs through a multi-enzymatic pathway which at the lactosylceramide step is branched into several biosynthetic series. Lc3 synthase utilizes a variety of galactose-terminated glycolipids as acceptors by establishing a glycosidic bond in the beta-1,3-linkage to GlcNAc to extend the lacto- and neolacto-series gangliosides. In order to examine the lacto-series ganglioside functions in mice, we used gene knockout technology to generate Lc3 synthase gene B3gnt5-deficient mice by two different strategies and compared the phenotypes of the two null mouse groups with each other and with their wild-type counterparts. RESULTS: B3gnt5 gene knockout mutant mice appeared normal in the embryonic stage and, if they survived delivery, remained normal during early life. However, about 9% developed early-stage growth retardation, 11% died postnatally in less than 2 months, and adults tended to die in 5-15 months, demonstrating splenomegaly and notably enlarged lymph nodes. Without lacto-neolacto series gangliosides, both homozygous and heterozygous mice gradually displayed fur loss or obesity, and breeding mice demonstrated reproductive defects. Furthermore, B3gnt5 gene knockout disrupted the functional integrity of B cells, as manifested by a decrease in B-cell numbers in the spleen, germinal center disappearance, and reduced proliferation efficiency in hybridoma fusion. CONCLUSIONS: These novel results demonstrate unequivocally that lacto-neolacto series gangliosides are essential to multiple physiological functions, especially the control of reproductive output, and that their absence leads to spleen B-cell abnormality. We also report the generation of anti-IgG response against the lacto-series gangliosides 3'-isoLM1 and 3',6'-isoLD1.

Relevance: 30.00%

Abstract:

BACKGROUND: Biological processes occur on a vast range of time scales, and many of them occur concurrently. As a result, system-wide measurements of gene expression have the potential to capture many of these processes simultaneously. The challenge, however, is to separate these processes and time scales in the data. In many cases the number of processes and their time scales are unknown. This issue is particularly relevant to developmental biologists, who are interested in processes such as growth, segmentation and differentiation, which can all take place simultaneously, but on different time scales. RESULTS: We introduce a flexible and statistically rigorous method for detecting different time scales in time-series gene expression data, by identifying expression patterns that are temporally shifted between replicate datasets. We apply our approach to a Saccharomyces cerevisiae cell-cycle dataset and an Arabidopsis thaliana root developmental dataset. In both datasets our method successfully detects processes operating on several different time scales. Furthermore, we show that many of these time scales can be associated with particular biological functions. CONCLUSIONS: The spatiotemporal modules identified by our method suggest the presence of multiple biological processes, acting at distinct time scales in both the Arabidopsis root and yeast. Using similar large-scale expression datasets, the identification of biological processes acting at multiple time scales in many organisms is now possible.

Relevance: 30.00%

Abstract:

BACKGROUND: Monogamy, together with abstinence, partner reduction, and condom use, is widely advocated as a key behavioral strategy to prevent HIV infection in sub-Saharan Africa. We examined the association between the number of sexual partners and the risk of HIV seropositivity among men and women presenting for HIV voluntary counseling and testing (VCT) in northern Tanzania. METHODOLOGY/PRINCIPAL FINDINGS: Clients presenting for HIV VCT at a community-based AIDS service organization in Moshi, Tanzania were surveyed between November 2003 and December 2007. Data on sociodemographic characteristics, reasons for testing, sexual behaviors, and symptoms were collected. Men and women were categorized by number of lifetime sexual partners, and rates of seropositivity were reported by category. Factors associated with HIV seropositivity among monogamous males and females were identified by a multivariate logistic regression model. Of 6,549 clients, 3,607 (55%) were female, and the median age was 30 years (IQR 24-40). 939 (25%) females and 293 (10%) males (p<0.0001) were HIV seropositive. Among 1,244 (34%) monogamous females and 423 (14%) monogamous males, the risk of HIV infection was 19% and 4%, respectively (p<0.0001). The risk increased monotonically with additional partners up to 45% (p<0.001) and 15% (p<0.001) for women and men, respectively, with 5 or more partners. In multivariate analysis, HIV seropositivity among monogamous women was most strongly associated with age (p<0.0001), lower education (p<0.004), and reporting a partner with other partners (p = 0.015). Only age was a significant risk factor for monogamous men (p = 0.0004). INTERPRETATION: Among women presenting for VCT, the number of partners is strongly associated with rates of seropositivity; however, even women reporting lifetime monogamy have a high risk for HIV infection. Partner reduction should be coupled with efforts to place tools in the hands of sexually active women to reduce their risk of contracting HIV.

Relevance: 30.00%

Abstract:

BACKGROUND: Dropouts and missing data are nearly ubiquitous in obesity randomized controlled trials, threatening validity and generalizability of conclusions. Herein, we meta-analytically evaluate the extent of missing data, the frequency with which various analytic methods are employed to accommodate dropouts, and the performance of multiple statistical methods. METHODOLOGY/PRINCIPAL FINDINGS: We searched PubMed and Cochrane databases (2000-2006) for articles published in English and manually searched bibliographic references. Articles of pharmaceutical randomized controlled trials with weight loss or weight gain prevention as major endpoints were included. Two authors independently reviewed each publication for inclusion. 121 articles met the inclusion criteria. Two authors independently extracted treatment, sample size, drop-out rates, study duration, and statistical method used to handle missing data from all articles and resolved disagreements by consensus. In the meta-analysis, drop-out rates were substantial, with the survival (non-dropout) rates being approximated by an exponential decay curve ($e^{-\lambda t}$), where $\lambda$ was estimated to be 0.0088 (95% bootstrap confidence interval: 0.0076 to 0.0100) and $t$ represents time in weeks. The estimated drop-out rate at 1 year was 37%. Most studies used last observation carried forward as the primary analytic method to handle missing data. We also obtained 12 raw obesity randomized controlled trial datasets for empirical analyses. Analyses of raw randomized controlled trial data suggested that both mixed models and multiple imputation performed well, but that multiple imputation may be more robust when missing data are extensive. CONCLUSION/SIGNIFICANCE: Our analysis offers an equation for predictions of dropout rates useful for future study planning. Our raw data analyses suggest that multiple imputation is better than other methods for handling missing data in obesity randomized controlled trials, followed closely by mixed models. We suggest these methods supplant last observation carried forward as the primary method of analysis.
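
As a rough illustration of the dropout equation reported in this abstract, the sketch below evaluates the exponential retention curve at a few follow-up times. The function names and the chosen time points are assumptions made for illustration only; the decay constant and its confidence bounds are the values quoted in the abstract, and taking one year as 52 weeks is also an assumption.

```python
import math

# Decay constant (per week) and 95% bootstrap bounds quoted in the abstract.
LAMBDA_PER_WEEK = 0.0088
LAMBDA_LOW, LAMBDA_HIGH = 0.0076, 0.0100


def retention(weeks: float, lam: float = LAMBDA_PER_WEEK) -> float:
    """Fraction of participants still enrolled after `weeks`: exp(-lam * weeks)."""
    return math.exp(-lam * weeks)


def dropout(weeks: float, lam: float = LAMBDA_PER_WEEK) -> float:
    """Fraction of participants lost after `weeks`: 1 - exp(-lam * weeks)."""
    return 1.0 - retention(weeks, lam)


if __name__ == "__main__":
    # Hypothetical follow-up lengths; one year is taken as 52 weeks.
    for weeks in (12, 26, 52):
        low = dropout(weeks, LAMBDA_LOW)
        high = dropout(weeks, LAMBDA_HIGH)
        print(f"{weeks:>2} weeks: {dropout(weeks):.1%} dropout "
              f"(range {low:.1%} to {high:.1%})")
```

Under these assumptions, `dropout(52)` rounds to the 37% one-year figure stated in the abstract, which is a useful sanity check when using the curve for study planning.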

Relevance: 30.00%

Abstract:

BACKGROUND: The inherent complexity of statistical methods and clinical phenomena compels researchers with diverse domains of expertise to work in interdisciplinary teams, where none of them has complete knowledge of their counterparts' fields. As a result, knowledge exchange may often be characterized by miscommunication leading to misinterpretation, ultimately resulting in errors in research and even clinical practice. Although communication has a central role in interdisciplinary collaboration and miscommunication can have a negative impact on research processes, to the best of our knowledge no study has yet explored how data analysis specialists and clinical researchers communicate over time. METHODS/PRINCIPAL FINDINGS: We conducted qualitative analysis of encounters between clinical researchers and data analysis specialists (epidemiologist, clinical epidemiologist, and data mining specialist). These encounters were recorded and systematically analyzed using a grounded theory methodology for extraction of emerging themes, followed by data triangulation and analysis of negative cases for validation. A policy analysis was then performed using a system dynamics methodology looking for potential interventions to improve this process. Four major emerging themes were found. Definitions using lay language were frequently employed as a way to bridge the language gap between the specialties. Thought experiments presented a series of "what if" situations that helped clarify how the method or information from the other field would behave if exposed to alternative situations, ultimately aiding in explaining their main objective. Metaphors and analogies were used to translate concepts across fields, from the unfamiliar to the familiar. Prolepsis was used to anticipate study outcomes, thus helping specialists understand the current context based on an understanding of their final goal. CONCLUSION/SIGNIFICANCE: The communication between clinical researchers and data analysis specialists presents multiple challenges that can lead to errors.

Relevance: 30.00%

Abstract:

Systemic challenges within child welfare have prompted many states to explore new strategies aimed at protecting children while meeting the needs of families, but doing so within the confines of shrinking budgets. Differential Response has emerged as a promising practice for low or moderate risk cases of child maltreatment. This mixed methods evaluation explored various aspects of North Carolina's differential response system, known as the Multiple Response System (MRS), including: child safety, timeliness of response and case decision, frontloading of services, case distribution, implementation of Child and Family Teams, collaboration with community-based service providers and Shared Parenting. Utilizing Child Protective Services (CPS) administrative data, researchers found that compared to matched control counties, MRS: had a positive impact on child safety evidenced by a decline in the rates of substantiations and re-assessments; temporarily disrupted timeliness of response in pilot counties but had no effect on time to case decision; and increased the number of upfront services provided to families during assessment. Qualitative data collected through focus groups with providers and phone interviews with families provided important information on key MRS strategies, highlighting aspects that families and social workers like as well as identifying areas for improvement. This information is useful for continuous quality improvement efforts, particularly related to the development of training and technical assistance programs at the state and local level.

Relevance: 30.00%

Abstract:

An enterprise information system (EIS) is an integrated data-applications platform characterized by diverse, heterogeneous, and distributed data sources. For many enterprises, a number of business processes still depend heavily on static rule-based methods and extensive human expertise. Enterprises are faced with the need for optimizing operation scheduling, improving resource utilization, discovering useful knowledge, and making data-driven decisions.

This thesis research is focused on real-time optimization and knowledge discovery that addresses workflow optimization, resource allocation, as well as data-driven predictions of process-execution times, order fulfillment, and enterprise service-level performance. In contrast to prior work on data analytics techniques for enterprise performance optimization, the emphasis here is on realizing scalable and real-time enterprise intelligence based on a combination of heterogeneous system simulation, combinatorial optimization, machine-learning algorithms, and statistical methods.

On-demand digital-print service is a representative enterprise requiring a powerful EIS. We use real-life data from Reischling Press, Inc. (RPI), a digital-print-service provider (PSP), to evaluate our optimization algorithms.

In order to handle the increase in volume and diversity of demands, we first present a high-performance, scalable, and real-time production scheduling algorithm for production automation based on an incremental genetic algorithm (IGA). The objective of this algorithm is to optimize the order dispatching sequence and balance resource utilization. Compared to prior work, this solution is scalable for a high volume of orders and it provides fast scheduling solutions for orders that require complex fulfillment procedures. Experimental results highlight its potential benefit in reducing production inefficiencies and enhancing the productivity of an enterprise.

We next discuss analysis and prediction of different attributes involved in hierarchical components of an enterprise. We start from a study of the fundamental processes related to real-time prediction. Our process-execution time and process status prediction models integrate statistical methods with machine-learning algorithms. In addition to improving prediction accuracy over stand-alone machine-learning algorithms, they also provide a probabilistic estimate of the predicted status. An order generally consists of multiple series and parallel processes. We next introduce an order-fulfillment prediction model that combines advantages of multiple classification models by incorporating flexible decision-integration mechanisms. Experimental results show that adopting due dates recommended by the model can significantly reduce the enterprise late-delivery ratio. Finally, we investigate service-level attributes that reflect the overall performance of an enterprise. We analyze and decompose time-series data into different components according to their hierarchical periodic nature, perform correlation analysis, and develop univariate prediction models for each component as well as multivariate models for correlated components. Predictions for the original time series are aggregated from the predictions of its components. In addition to a significant increase in mid-term prediction accuracy, this distributed modeling strategy also improves short-term time-series prediction accuracy.

In summary, this thesis research has led to a set of characterization, optimization, and prediction tools for an EIS to derive insightful knowledge from data and use them as guidance for production management. It is expected to provide solutions for enterprises to increase reconfigurability, accomplish more automated procedures, and obtain data-driven recommendations or effective decisions.

Relevance: 30.00%

Abstract:

BACKGROUND: The evolutionary relationships of modern birds are among the most challenging to understand in systematic biology and have been debated for centuries. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders, and used the genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomics analyses (Jarvis et al. in press; Zhang et al. in press). Here we release assemblies and datasets associated with the comparative genome analyses, which include 38 newly sequenced avian genomes plus previously released or simultaneously released genomes of Chicken, Zebra finch, Turkey, Pigeon, Peregrine falcon, Duck, Budgerigar, Adelie penguin, Emperor penguin and the Medium Ground Finch. We hope that this resource will serve future efforts in phylogenomics and comparative genomics. FINDINGS: The 38 bird genomes were sequenced using the Illumina HiSeq 2000 platform and assembled using a whole genome shotgun strategy. The 48 genomes were categorized into two groups according to the N50 scaffold size of the assemblies: a high depth group comprising 23 species sequenced at high coverage (>50X) with multiple insert size libraries resulting in N50 scaffold sizes greater than 1 Mb (except the White-throated Tinamou and Bald Eagle); and a low depth group comprising 25 species sequenced at a low coverage (~30X) with two insert size libraries resulting in an average N50 scaffold size of about 50 kb. Repetitive elements comprised 4%-22% of the bird genomes. The assembled scaffolds allowed the homology-based annotation of 13,000 to 17,000 protein coding genes in each avian genome relative to chicken, zebra finch and human, as well as comparative and sequence conservation analyses. CONCLUSIONS: Here we release full genome assemblies of 38 newly sequenced avian species, link genome assembly downloads for 7 of the remaining 10 species, and provide a guideline of genomic data that has been generated and used in our Avian Phylogenomics Project. To the best of our knowledge, the Avian Phylogenomics Project is the biggest vertebrate comparative genomics project to date. The genomic data presented here is expected to accelerate further analyses in many fields, including phylogenetics, comparative genomics, evolution, neurobiology, developmental biology, and other related areas.
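
The abstract above groups assemblies by N50 scaffold size. For readers unfamiliar with the statistic, the following is a minimal sketch of the usual definition (the length of the scaffold at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly). The function name and the toy scaffold lengths are hypothetical and are not taken from the avian assemblies.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are sorted from longest to shortest; N50 is the length of the
    scaffold at which the cumulative length first reaches at least half of
    the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: cumulative sum always reaches the total")


if __name__ == "__main__":
    # Toy lengths for illustration only.
    toy_scaffolds = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(n50(toy_scaffolds))
```

With the toy lengths above, the total is 54, and the three longest scaffolds (10, 9, 8) sum to 27, so the N50 is 8. A larger N50 generally indicates a more contiguous assembly, which is why it is a natural criterion for separating the high depth and low depth groups.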

Relevance: 30.00%

Abstract:

Interest in structural brain connectivity has grown with the understanding that abnormal neural connections may play a role in neurologic and psychiatric diseases. Small animal connectivity mapping techniques are particularly important for identifying aberrant connectivity in disease models. Diffusion magnetic resonance imaging tractography can provide nondestructive, 3D, brain-wide connectivity maps, but has historically been limited by low spatial resolution, low signal-to-noise ratio, and the difficulty in estimating multiple fiber orientations within a single image voxel. Small animal diffusion tractography can be substantially improved through the combination of ex vivo MRI with exogenous contrast agents, advanced diffusion acquisition and reconstruction techniques, and probabilistic fiber tracking. Here, we present a comprehensive, probabilistic tractography connectome of the mouse brain at microscopic resolution, and a comparison of these data with neuronal tracer-based connectivity data from the Allen Brain Atlas. This work serves as a reference database for future tractography studies in the mouse brain, and demonstrates the fundamental differences between tractography and neuronal tracer data.

Relevance: 30.00%

Abstract:

BACKGROUND: The Affordable Care Act encourages healthcare systems to integrate behavioral and medical healthcare, as well as to employ electronic health records (EHRs) for health information exchange and quality improvement. Pragmatic research paradigms that employ EHRs in research are needed to produce clinical evidence in real-world medical settings for informing learning healthcare systems. Adults with comorbid diabetes and substance use disorders (SUDs) tend to use costly inpatient treatments; however, there is a lack of empirical data on implementing behavioral healthcare to reduce health risk in adults with high-risk diabetes. Given the complexity of high-risk patients' medical problems and the cost of conducting randomized trials, a feasibility project is warranted to guide practical study designs. METHODS: We describe the study design, which explores the feasibility of implementing substance use Screening, Brief Intervention, and Referral to Treatment (SBIRT) among adults with high-risk type 2 diabetes mellitus (T2DM) within a home-based primary care setting. Our study includes the development of an integrated EHR datamart to identify eligible patients and collect diabetes healthcare data, and the use of a geographic health information system to understand the social context in patients' communities. Analysis will examine recruitment, proportion of patients receiving brief intervention and/or referrals, substance use, SUD treatment use, diabetes outcomes, and retention. DISCUSSION: By capitalizing on an existing T2DM project that uses home-based primary care, our study results will provide timely clinical information to inform the designs and implementation of future SBIRT studies among adults with multiple medical conditions.

Relevance: 30.00%

Abstract:

We recently developed an approach for testing the accuracy of network inference algorithms by applying them to biologically realistic simulations with known network topology. Here, we seek to determine the degree to which the network topology and data sampling regime influence the ability of our Bayesian network inference algorithm, NETWORKINFERENCE, to recover gene regulatory networks. NETWORKINFERENCE performed well at recovering feedback loops and multiple targets of a regulator with small amounts of data, but required more data to recover multiple regulators of a gene. When collecting the same number of data samples at different intervals from the system, the best recovery was produced by sampling intervals long enough such that sampling covered propagation of regulation through the network but not so long such that intervals missed internal dynamics. These results further elucidate the possibilities and limitations of network inference based on biological data.