981 results for Brock University -- Dept. of Geological Sciences


Relevance:

100.00% 100.00%

Publisher:

Abstract:

We present a collection of R packages for conducting and distributing reproducible research using R, Sweave, and LaTeX. The collection consists of the cacheSweave, stashR, and SRPM packages which allow for the caching of computations in Sweave documents and the distribution of those cached computations via remotely accessible key-value databases. We describe the caching mechanism used by the cacheSweave package and tools that we have developed for authors and readers for the purposes of creating and interacting with reproducible documents.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Functional Magnetic Resonance Imaging (fMRI) is a non-invasive technique which is commonly used to quantify changes in blood oxygenation and flow coupled to neuronal activation. One of the primary goals of fMRI studies is to identify localized brain regions where neuronal activation levels vary between groups. Single voxel t-tests have been commonly used to determine whether activation related to the protocol differs across groups. Due to the generally limited number of subjects within each study, accurate estimation of variance at each voxel is difficult. Thus, combining information across voxels in the statistical analysis of fMRI data is desirable in order to improve efficiency. Here we construct a hierarchical model and apply an Empirical Bayes framework to the analysis of group fMRI data, employing techniques used in high throughput genomic studies. The key idea is to shrink residual variances by combining information across voxels, and subsequently to construct an improved test statistic in lieu of the classical t-statistic. This hierarchical model results in a shrinkage of voxel-wise residual sample variances towards a common value. The shrunken estimator for voxel-specific variance components in the group analyses outperforms the classical residual error estimator in terms of mean squared error. Moreover, the shrunken test statistic decreases the false positive rate when testing differences in brain contrast maps across a wide range of simulation studies. This methodology was also applied to experimental data regarding a cognitive activation task.
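
As a rough illustration of the shrinkage idea described in this abstract, the sketch below pulls each voxel's residual sample variance toward a common value and then forms a t-like statistic from the shrunken variance. It uses the weighted-average form that is standard in genomic applications; whether the paper uses exactly this form is an assumption. The prior degrees of freedom `d0` and prior variance `s0_sq` are set by hand here, whereas an Empirical Bayes analysis would estimate them from all voxels. The function name `moderated_t` and all numbers are hypothetical.

```python
import math

def moderated_t(mean_diff, sample_var, df, d0, s0_sq, unscaled_se):
    """Shrink one voxel's residual variance toward s0_sq, then form a t-like statistic.

    The shrunken variance is a weighted average of the common value s0_sq
    (weight d0) and the voxel's own sample variance (weight df).
    d0 and s0_sq are assumed known here; they are illustrative, not estimated.
    """
    shrunk_var = (d0 * s0_sq + df * sample_var) / (d0 + df)
    t_stat = mean_diff / (unscaled_se * math.sqrt(shrunk_var))
    return shrunk_var, t_stat

# Two hypothetical voxels with the same group difference but different sample variances.
noisy = moderated_t(mean_diff=1.0, sample_var=4.0, df=8, d0=4, s0_sq=1.0, unscaled_se=0.5)
quiet = moderated_t(mean_diff=1.0, sample_var=0.01, df=8, d0=4, s0_sq=1.0, unscaled_se=0.5)

# Classical t-statistic for the low-variance voxel, for comparison.
classical_quiet_t = 1.0 / (0.5 * math.sqrt(0.01))
```

In this toy case the voxel with an implausibly small sample variance gets a much less extreme statistic after shrinkage than under the classical formula, which is the mechanism by which a shrunken statistic can reduce false positives.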

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Genotyping platforms such as Affymetrix can be used to assess genotype-phenotype as well as copy number-phenotype associations at millions of markers. While genotyping algorithms are largely concordant when assessed on HapMap samples, tools to assess copy number changes are more variable and often discordant. One explanation for the discordance is that copy number estimates are susceptible to systematic differences between groups of samples that were processed at different times or by different labs. Analysis algorithms that do not adjust for batch effects are prone to spurious measures of association. The R package crlmm implements a multilevel model that adjusts for batch effects and provides allele-specific estimates of copy number. This paper illustrates a workflow for the estimation of allele-specific copy number, develops marker- and study-level summaries of batch effects, and demonstrates how the marker-level estimates can be integrated with complementary Bioconductor software for inferring regions of copy number gain or loss. All analyses are performed in the statistical environment R. A compendium for reproducing the analysis is available from the author’s website (http://www.biostat.jhsph.edu/~rscharpf/crlmmCompendium/index.html).

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The purpose of this study is to develop statistical methodology to facilitate indirect estimation of the concentration of antiretroviral drugs and viral loads in the prostate gland and the seminal vesicle. The differences in antiretroviral drug concentrations in these organs may lead to suboptimal concentrations in one gland compared to the other. Suboptimal levels of the antiretroviral drugs will not fully suppress the virus in that gland, will create a source of sexually transmissible virus, and will increase the chance of selecting for drug-resistant virus. This information may be useful in selecting an antiretroviral drug regimen that will achieve optimal concentrations in most of the male genital tract glands. Using fractionally collected semen ejaculates, Lundquist (1949) measured levels of surrogate markers in each fraction that are uniquely produced by specific male accessory glands. To determine the original glandular concentrations of the surrogate markers, Lundquist solved a simultaneous series of linear equations. This method has several limitations. In particular, it does not yield a unique solution, it does not address measurement error, and it disregards inter-subject variability in the parameters. To cope with these limitations, we developed a mechanistic latent variable model based on the physiology of the male genital tract and surrogate markers. We employ a Bayesian approach and perform a sensitivity analysis with regard to the distributional assumptions on the random effects and priors. The model and Bayesian approach are validated on experimental data where the concentration of a drug should be (biologically) differentially distributed between the two glands. In this example, the Bayesian model-based conclusions are found to be robust to model specification, and this hierarchical approach leads to more scientifically valid conclusions than the original methodology. In particular, unlike existing methods, the proposed model-based approach was not affected by a common form of outliers.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The stashR package (a Set of Tools for Administering SHared Repositories) for R implements a simple key-value style database where character string keys are associated with data values. The key-value databases can be either stored locally on the user's computer or accessed remotely via the Internet. Methods specific to the stashR package allow users to share data repositories or access previously created remote data repositories. In particular, methods are available for the S4 classes localDB and remoteDB to insert, retrieve, or delete data from the database as well as to synchronize local copies of the data to the remote version of the database. Users efficiently access information from a remote database by retrieving only the data files indexed by user-specified keys and caching this data in a local copy of the remote database. The local and remote counterparts of the stashR package offer the potential to enhance reproducible research by allowing users of Sweave to cache their R computations for a research paper in a localDB database. This database can then be stored on the Internet as a remoteDB database. When readers of the research paper wish to reproduce the computations involved in creating a specific figure or calculating a specific numeric value, they can access the remoteDB database and obtain the R objects involved in the computation.
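
The local/remote caching pattern described in this abstract can be sketched in a few lines. The example below is written in Python and is not the stashR API: the class names `LocalStore` and `CachedRemoteStore`, their methods, and the `download` callable (a stand-in for fetching a file over the Internet) are all hypothetical. It only illustrates the idea of retrieving a key from the remote side once and serving later requests from a local copy.

```python
import os
import pickle
import tempfile

class LocalStore:
    """Key-value store keeping one pickled file per key in a directory.

    Keys are used directly as file names, so this sketch assumes simple keys
    without path separators.
    """

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        return os.path.join(self.directory, key)

    def insert(self, key, value):
        with open(self._path(key), "wb") as fh:
            pickle.dump(value, fh)

    def exists(self, key):
        return os.path.exists(self._path(key))

    def fetch(self, key):
        with open(self._path(key), "rb") as fh:
            return pickle.load(fh)

    def delete(self, key):
        os.remove(self._path(key))


class CachedRemoteStore:
    """Read-only view of a remote store that caches fetched keys locally."""

    def __init__(self, download, cache_dir):
        self.download = download        # callable: key -> value (stand-in for a network fetch)
        self.cache = LocalStore(cache_dir)
        self.downloads = 0              # number of remote round-trips made so far

    def fetch(self, key):
        if not self.cache.exists(key):  # contact the remote side only on a cache miss
            self.cache.insert(key, self.download(key))
            self.downloads += 1
        return self.cache.fetch(key)


# A dictionary plays the role of the remote repository in this sketch.
remote_data = {"fig1_model": [0.2, 0.5], "table2": {"n": 10}}
store = CachedRemoteStore(remote_data.__getitem__, tempfile.mkdtemp())
first = store.fetch("fig1_model")    # downloaded and cached
second = store.fetch("fig1_model")   # served from the local cache
```

Only the keys a reader actually asks for are transferred, which mirrors the abstract's point that users retrieve just the data files indexed by the keys they specify.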

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Prospective cohort studies have provided evidence on longer-term mortality risks of fine particulate matter (PM2.5), but due to their complexity and costs, only a few have been conducted. By linking monitoring data to the U.S. Medicare system by county of residence, we developed a retrospective cohort study, the Medicare Air Pollution Cohort Study (MCAPS), comprising over 20 million enrollees in the 250 largest counties during 2000-2002. We estimated log-linear regression models having as outcome the age-specific mortality rate for each county and as the main predictor, the average PM2.5 level for the study period 2000. Area-level covariates were used to adjust for socio-economic status and smoking. We reported results under several degrees of adjustment for spatial confounding and with stratification into eastern, central, and western counties. We estimated that a 10 µg/m3 increase in PM2.5 is associated with a 7.6% increase in mortality (95% CI: 4.4 to 10.8%). We found a stronger association in the eastern counties than nationally, with no evidence of an association in western counties. When adjusted for spatial confounding, the estimated log-relative risks drop by 50%. We demonstrated the feasibility of using Medicare data to establish cohorts for follow-up for effects of air pollution. Particulate matter (PM) air pollution is a global public health problem (1). In developing countries, levels of airborne particles still reach concentrations at which serious health consequences are well-documented; in developed countries, recent epidemiologic evidence shows continued adverse effects, even though particle levels have declined in the last two decades (2-6). Increased mortality associated with higher levels of PM air pollution has been of particular concern, giving an imperative for stronger protective regulations (7). Evidence on PM and health comes from studies of acute and chronic adverse effects (6). 
The London Fog of 1952 provides dramatic evidence of the unacceptable short-term risk of extremely high levels of PM air pollution (8-10); multi-site time-series studies of daily mortality show that far lower levels of particles are still associated with short-term risk (5, 11-13). Cohort studies provide complementary evidence on the longer-term risks of PM air pollution, indicating the extent to which exposure reduces life expectancy. The design of these studies involves follow-up of cohorts for mortality over periods of years to decades and an assessment of mortality risk in association with estimated long-term exposure to air pollution (2-4, 14-17). Because of the complexity and costs of such studies, only a small number have been conducted. The most rigorously executed, including the Harvard Six Cities Study and the American Cancer Society’s (ACS) Cancer Prevention Study II, have provided generally consistent evidence for an association of long-term exposure to particulate matter air pollution with increased all-cause and cardio-respiratory mortality (2, 4, 14, 15). Results from these studies have been used in risk assessments conducted for setting the U.S. National Ambient Air Quality Standard (NAAQS) for PM and for estimating the global burden of disease attributable to air pollution (18, 19). Additional prospective cohort studies are necessary, however, to confirm associations between long-term exposure to PM and mortality, to broaden the populations studied, and to refine estimates by regions across which particle composition varies. Toward this end, we have used data from the U.S. Medicare system, which covers nearly all persons 65 years of age and older in the United States. 
We linked Medicare mortality data to PM2.5 (particulate matter less than 2.5 µm in aerodynamic diameter) air pollution monitoring data to create a new retrospective cohort study, the Medicare Air Pollution Cohort Study (MCAPS), consisting of 20 million persons from 250 counties and representing about 50% of the US population of elderly living in urban settings. In this paper, we report on the relationship between longer-term exposure to PM2.5 and mortality risk over the period 2000 to 2002 in the MCAPS.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, we consider estimation of the causal effect of a treatment on an outcome from observational data collected in two phases. In the first phase, a simple random sample of individuals is drawn from a population. On these individuals, information is obtained on treatment, outcome, and a few low-dimensional confounders. These individuals are then stratified according to these factors. In the second phase, a random sub-sample of individuals is drawn from each stratum, with known, stratum-specific selection probabilities. On these individuals, a rich set of confounding factors is collected. In this setting, we introduce four estimators: (1) simple inverse weighted, (2) locally efficient, (3) doubly robust, and (4) enriched inverse weighted. We evaluate the finite-sample performance of these estimators in a simulation study. We also use our methodology to estimate the causal effect of trauma care on in-hospital mortality using data from the National Study of Cost and Outcomes of Trauma.
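
To make the role of the known, stratum-specific selection probabilities concrete, the sketch below weights each second-phase individual by the inverse of the probability of being sub-sampled. It covers only the sampling-weight step and is not one of the four estimators named in the abstract: it does not adjust for confounders, so the resulting difference is not a causal effect. The function name, stratum labels, and data are hypothetical.

```python
def weighted_arm_means(records, selection_prob):
    """Return inverse-probability-weighted mean outcomes (treated, untreated).

    records: iterable of (stratum, treated, outcome) for second-phase individuals.
    selection_prob: dict mapping stratum -> known second-phase sampling probability.
    Assumes each treatment arm contains at least one record.
    """
    totals = {True: 0.0, False: 0.0}
    weights = {True: 0.0, False: 0.0}
    for stratum, treated, outcome in records:
        w = 1.0 / selection_prob[stratum]   # one sampled person stands in for w first-phase people
        totals[treated] += w * outcome
        weights[treated] += w
    return totals[True] / weights[True], totals[False] / weights[False]

# Hypothetical strata "A" and "B" with different sub-sampling probabilities.
selection_prob = {"A": 0.5, "B": 0.25}
phase_two = [
    ("A", True, 1.0), ("A", False, 0.0),
    ("B", True, 0.0), ("B", False, 1.0),
]
treated_mean, untreated_mean = weighted_arm_means(phase_two, selection_prob)
```

Individuals from the lightly sampled stratum count for more, so the weighted means reflect the first-phase sample and not the deliberately unbalanced second-phase sub-sample.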

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In medical follow-up studies, ordered bivariate survival data are frequently encountered when bivariate failure events are used as the outcomes to identify the progression of a disease. In cancer studies, interest could be focused on bivariate failure times, for example, time from birth to cancer onset and time from cancer onset to death. This paper considers a sampling scheme where the first failure event (cancer onset) is identified within a calendar time interval, the time of the initiating event (birth) can be retrospectively confirmed, and the occurrence of the second event (death) is observed subject to right censoring. To analyze this type of bivariate failure time data, it is important to recognize the presence of bias arising due to interval sampling. In this paper, nonparametric and semiparametric methods are developed to analyze the bivariate survival data with interval sampling under stationary and semi-stationary conditions. Numerical studies demonstrate that the proposed estimating approaches perform well with practical sample sizes in different simulated models. We apply the proposed methods to SEER ovarian cancer registry data for illustration of the methods and theory.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Recently the International Union of Geological Sciences (Commission on Stratigraphy, Working Group on the Paleogene/Neogene Boundary) proposed that the Oligocene/Miocene boundary be placed at the base of Chron C6Cn2n at 23.8 Ma on the Cande and Kent (1992) magnetic time scale, where it is approximated by planktic foraminifera at the first occurrence of Globorotalia kugleri, and by calcareous nannofossils at the last occurrence of Sphenolithus ciperoensis and the first and last occurrences of Sphenolithus delphix and S. capricornutus. Herein we show that, in terms of radiolarians, the base of Chron C6Cn2n can be correlated with the upper part of the Lychnocanoma elongata Zone between the last occurrence of Artophormis gracilis (23.94 Ma) and the first occurrence of Cyrtocapsella tetrapera (23.69 Ma). Since the proposed stratotype at Lemme-Carrosio (Italy) does not contain radiolarians at the boundary, we re-examined 13 DSDP sites and established the stratigraphic sequence of 29 first and last radiolarian occurrences and one evolutionary transition across the boundary. Nine of these sites contain both calcareous and siliceous microfossils and thus allow for an integrated biostratigraphy. Paleomagnetic stratigraphy is not available for any of the DSDP cores examined. However, use of Hodell and Woodruff's (1994) strontium isotope curve from DSDP Site 289 has permitted calibration of several low latitude microfossil datum levels against the geomagnetic polarity scale. Two new species, Lychnocanoma apodora and Eucyrtidium plesiodiaphanes, are described.