935 results for complex data


Relevance:

30.00%

Publisher:

Abstract:

The Rio San Juan Complex is an important occurrence of high pressure/low temperature rocks in the circum-Caribbean region which contains both coherent blueschist units and two varieties of melange in the same area. The melanges contain a diverse assemblage of blocks of various sizes, different degrees of metamorphism, and mineral assemblages. Some high pressure blocks show two stages of metamorphism. The earliest stage is characterized by high pressure-low temperature conditions and the second stage is characterized by high pressure-lower temperature conditions. The geochemistry of thirteen samples from the Rio San Juan Complex has been studied and data have been compared with rocks of adjacent regions. Geochemical evidence indicates that rocks from the Rio San Juan Complex have predominant calc-alkaline affinities with subordinate tholeiitic affinities. This suggests that they have a multiple tectonic provenance.

Relevance:

30.00%

Publisher:

Abstract:

Disasters are complex events characterized by damage to key infrastructure and population displacements into disaster shelters. Assessing the living environment in shelters during disasters is a crucial health security concern. Until now, jurisdictional knowledge of and preparedness for those assessment methods, and knowledge of the deficiencies found in shelters, have been limited. A cross-sectional survey (STUSA survey) ascertained knowledge and preparedness for those assessments in all 50 states, DC, and 5 US territories. Descriptive analysis of overall knowledge and preparedness was performed. Fisher’s exact tests analyzed differences by two groupings: jurisdiction type and population size. Two logistic regression models analyzed earthquake and hurricane risks as predictors of knowledge and preparedness. A convenience sample of state shelter assessment records (n=116) was analyzed to describe environmental health deficiencies found during selected events. Overall, 55 (98%) of jurisdictions responded (states and territories) and appeared to be knowledgeable of these assessments (states 92%, territories 100%, p = 1.000), and engaged in disaster planning with shelter partners (states 96%, territories 83%, p = 0.564). Few had shelter assessment procedures (states 53%, territories 50%, p = 1.000) or training in disaster shelter assessments (states 41%, territories 60%, p = 0.638). Neither knowledge nor preparedness was predicted by disaster risks, population size, or jurisdiction type in either model. Knowledge model: hurricane (adjusted OR 0.69, 95% C.I. 0.06-7.88); earthquake (OR 0.82, 95% C.I. 0.17-4.06); and both risks (OR 1.44, 95% C.I. 0.24-8.63). Preparedness model: hurricane (OR 1.91, 95% C.I. 0.06-20.69); earthquake (OR 0.47, 95% C.I. 0.7-3.17); and both risks (OR 0.50, 95% C.I. 0.06-3.94). Environmental health deficiencies documented in shelter assessments occurred mostly in sanitation (30%), facility (17%), food (15%), and sleeping areas (12%), and during ice storms and tornadoes.
More research is needed in the area of environmental health assessments of disaster shelters, particularly in those areas that may provide better insight into the living environment of all shelter occupants and potential effects on disaster morbidity and mortality. Research is also needed to evaluate the effectiveness and usefulness of these assessment methods, and of the available data on environmental health deficiencies, in risk management to protect those at greater risk in shelter facilities during disasters.

Relevance:

30.00%

Publisher:

Abstract:

The exponential growth of studies on the biological response to ocean acidification over the last few decades has generated a large amount of data. To facilitate data comparison, a data compilation hosted at the data publisher PANGAEA was initiated in 2008 and is updated on a regular basis (doi:10.1594/PANGAEA.149999). By January 2015, a total of 581 data sets (over 4 000 000 data points) from 539 papers had been archived. Here we present the developments of this data compilation five years since its first description by Nisumaa et al. (2010). Most of the study sites from which data have been archived are still in the Northern Hemisphere, and the number of archived data sets from studies in the Southern Hemisphere and polar oceans is still relatively low. Data from 60 studies that investigated the response of a mix of organisms or natural communities were all added after 2010, indicating a welcome shift from the study of individual organisms to communities and ecosystems. The initial imbalance, with considerably more data archived on calcification and primary production than on other processes, has improved. There is also a clear tendency towards more data archived from multifactorial studies after 2010. For easier and more effective access to ocean acidification data, the ocean acidification community is strongly encouraged to contribute to the data archiving effort, to help develop standard vocabularies describing the variables, and to define best practices for archiving ocean acidification data.

Relevance:

30.00%

Publisher:

Abstract:

The Buchans ore bodies of central Newfoundland represent some of the highest grade VMS deposits ever mined. These Kuroko-type deposits are also known for the well developed and preserved nature of the mechanically transported deposits. The deposits are hosted in Cambro-Ordovician, dominantly calc-alkaline, bimodal volcanic and epiclastic sequences of the Notre Dame Subzone, Newfoundland Appalachians. Stratigraphic relationships in this zone are complicated by extensively developed, brittle-dominated Silurian thrust faulting. Hydrothermal alteration of host rocks is a common feature of nearly all VMS deposits, and the recognition of these zones has been a key exploration tool. Alteration of host rocks has long been described as spatially associated with the Buchans ore bodies, most notably with the larger in-situ deposits. This report represents a base-line study in which a complete documentation of the geochemical variance, in terms of both primary (igneous) and alteration effects, is presented from altered volcanic rocks in the vicinity of the Lucky Strike deposit (LSZ), the largest in-situ deposit in the Buchans camp. Packages of altered rocks also occur away from the immediate mining areas and constitute new targets for exploration. These zones, identified mostly by recent and previous drilling, represent untested targets and include the Powerhouse (PHZ), Woodmans Brook (WBZ) and Airport (APZ) alteration zones, as well as the Middle Branch alteration zone (MBZ), which represents a more distal alteration facies related to Buchans ore-formation. Data from each of these zones were compared to those from the LSZ in order to evaluate their relative prospectivity. Derived lithogeochemical data served two functions: to define (i) primary (igneous) trends and (ii) secondary alteration trends. Primary trends were established using immobile, or conservative, elements (i.e., HFSE, REE, Th, TiO₂, Al₂O₃, P₂O₅).
From these, altered volcanic rocks were interpreted in terms of composition (e.g., basalt - rhyodacite) and magmatic affinity (e.g., calc-alkaline vs. tholeiitic). The information suggests that bimodality is a common feature of all zones, with most rocks plotting as either basalt/andesite or dacite (or rhyodacite); andesitic sensu stricto compositions are rare. Magmatic affinities are more varied and complex, but indicate that all units are arc volcanic sequences. Rocks from the LSZ/MBZ represent a transitional to calc-alkalic sequence; however, a slight shift in key geochemical discriminants occurs between the foot-wall and the hanging-wall. Specifically, mafic and felsic lavas of the foot-wall are of transitional (or mildly calc-alkaline) affinity, whereas the hanging-wall rocks are relatively more strongly calc-alkaline, as indicated by enriched LREE/HREE and higher ZrN, NbN and other ratios in the latter. The geochemical variations also serve as a means to separate the units (at least the felsic rocks) into hanging-wall and foot-wall sequences, therefore providing a valuable exploration tool. Volcanic rocks from the WBZ/PHZ (and probably the APZ) are more typical of tholeiitic to transitional suites, yielding flatter mantle-normalized REE patterns and lower ZrN ratios. Thus, the relationships between the immediate mining area (represented by LSZ/MBZ) and the Buchans East (PHZ/WBZ) and the APZ are uncertain. Host rocks for all zones consist of mafic to felsic volcanic rocks, though the proportion of pyroclastic and epiclastic rocks is greatest at the LSZ. Phenocryst assemblages and textures are common to all zones, with minor exceptions, and are not useful for discrimination purposes. Felsic rocks from all zones are dominated by sericite-clay ± silica alteration, whereas mafic rocks are dominated by chlorite-quartz-sericite alteration. Pyrite is ubiquitous in all moderately altered rocks and minor associated base metal sulphides occur locally.
The exception is at Lucky Strike, where stockwork quartz veining contains abundant base-metal mineralization and barite. Rocks composed entirely of chlorite (chloritite) also occur in the LSZ foot-wall. In addition, K-feldspar alteration occurs in felsic volcanic rocks at the MBZ associated with Zn-Pb-Ba and, notably, without chlorite. This zone represents a peripheral, but proximal, zone of alteration induced by lower temperature hydrothermal fluids, presumably with little influence from seawater. Alteration geochemistry was interpreted from raw data as well as from mass balanced (recalculated) data derived from immobile element pairs. The data from the LSZ/MBZ indicate a range in the degree of alteration from only minor to severe modification of precursor compositions. Ba tends to show a strong positive correlation with K₂O, although most Ba occurs as barite. With respect to mass changes, Al₂O₃, TiO₂ and P₂O₅ were shown to be immobile. Nearly all rocks display mass loss of Na₂O, CaO, and Sr, reflecting feldspar destruction. These trends are usually mirrored by K₂O-Rb and MgO addition, indicating sericitic and chloritic alteration, respectively. More substantial gains of K₂O often occur in rocks with K-feldspar alteration, whereas a few samples also displayed excessive MgO enrichment and represent chloritites. Fe₂O₃ indicates both chlorite and sulphide formation. SiO₂ addition is almost always the case for the altered mafic rocks, as silica often infills amygdules and replaces the finer tuffaceous material. The felsic rocks display more variability in SiO₂. Silicic, sericitic and chloritic alteration trends were observed from the other zones, but not K-feldspar, chloritite, or barite.
Microprobe analyses of chlorites, sericites and carbonates indicate: (i) sericites from all zones are defined as muscovite and are not phengitic; (ii) at the LSZ, chlorites ranged from Fe-Mg chlorites (pycnochlorite) to Mg-rich chlorite (penninite), with the latter occurring in the stockwork zone and more proximal alteration facies; (iii) chlorites from the WBZ were typical of those from the more distal alteration facies of the LSZ, plotting as ripidolite to pycnochlorite; (iv) conversely, chlorites from the PHZ plot with Mg-Al-rich compositions (clinochlore to penninite); and (v) carbonate species from each zone are also varied, with calcite occurring in each zone, in addition to dolomite and ankerite in the PHZ and WBZ, respectively. Lead isotope ratios for galena separates from the various zones, when combined with data from older studies, tend to cluster into four distinctive fields. Overall, the data plot on a broad mixing line and indicate evolution in a relatively low-μ environment. Data from sulphide stringers in altered MBZ rocks, as well as from clastic sulphides (Sandfill prospect), plot in the Buchans ore field, as do the data for galena from altered rocks in the APZ. Samples from the Buchans East area are even more primitive than the Buchans ores, with lead from the PHZ plotting with the Connel Option prospect and data from the WBZ matching those of the Skidder prospect. A sample from a newly discovered debris flow-type sulphide occurrence (Middle Branch East) yields lead isotope ratios that are slightly more radiogenic than Buchans and plot with the Mary March alteration zone. Data within each cluster are interpreted to represent derivation from individual hydrothermal systems in which metals were derived from a common source.
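
The abstract mentions mass-balanced (recalculated) data derived from immobile elements without naming the procedure. As an illustration only, the sketch below assumes a single-precursor approach with Al₂O₃ as the immobile monitor: the altered analysis is rescaled so that its Al₂O₃ matches the precursor, and the mass change of each oxide is the rescaled value minus the precursor value. The function name and all compositions are hypothetical and are not taken from the study.

```python
def mass_changes(precursor, altered, monitor="Al2O3"):
    """Mass change per oxide (wt% relative to the precursor).

    Assumes `monitor` was immobile, so any difference in its concentration
    reflects net mass gain or loss of the whole rock.
    """
    # Scale factor that restores the monitor to its precursor concentration.
    factor = precursor[monitor] / altered[monitor]
    # Reconstructed composition minus precursor composition.
    return {oxide: altered[oxide] * factor - precursor[oxide] for oxide in precursor}


# Hypothetical analyses in wt% (illustrative values only).
precursor = {"SiO2": 70.0, "Al2O3": 14.0, "Na2O": 4.0, "K2O": 2.0}
altered = {"SiO2": 72.0, "Al2O3": 16.0, "Na2O": 0.8, "K2O": 4.8}

changes = mass_changes(precursor, altered)
for oxide, delta in changes.items():
    print(f"{oxide}: {delta:+.2f}")
```

In this made-up example Na₂O shows a loss and K₂O a gain, the same qualitative pattern the abstract attributes to feldspar destruction and sericitic alteration.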

Relevance:

30.00%

Publisher:

Abstract:

In recent years network theory has been applied to a wide variety of fields, revealing properties that characterize all real networks. In this work we applied the tools of network theory to brain data obtained through resting-state functional MRI from two experiments. fMRI data are particularly well suited to being studied with complex networks, since a single experiment typically yields more than one hundred thousand time series per individual, each with more than 100 values. Brain data in humans are highly variable, and every data acquisition operation, as well as every step in constructing the network, requires particular care. To obtain a network from the raw data, each preprocessing step was carried out with dedicated software and also with new methods that we implemented. The first data set analyzed was used as a reference for characterizing the properties of the network, in particular the centrality measures, since few studies on the subject have been conducted so far. Some of the measures used show significant centrality values when compared with a null model. This behavior was also investigated at different time points using a sliding-window approach, applying a statistical test based on a more complex null model. The second data set analyzed concerns individuals in four different resting states, from a level of full consciousness to one of deep unconsciousness. We then investigated the power of these centrality measures to discriminate between different states, and they proved to be potential biomarkers of states of consciousness. We also found that not all measures have the same discriminating power. To the best of our knowledge, this is the first study to characterize differences between states of consciousness in the brains of healthy individuals by means of network theory.

Relevance:

30.00%

Publisher:

Abstract:

The Lena River Delta, situated in Northern Siberia (72.0 - 73.8° N, 122.0 - 129.5° E), is the largest Arctic delta and covers 29,000 km². Since natural deltas are characterised by complex geomorphological patterns and various types of ecosystems, high spatial resolution information on the distribution and extent of the delta environments is necessary for a spatial assessment and accurate quantification of biogeochemical processes as drivers for the emission of greenhouse gases from tundra soils. In this study, the first land cover classification for the entire Lena Delta based on Landsat 7 Enhanced Thematic Mapper (ETM+) images was conducted and used for the quantification of methane emissions from the delta ecosystems on the regional scale. The applied supervised minimum distance classification was very effective with the few ancillary data that were available for training site selection. Nine land cover classes of aquatic and terrestrial ecosystems in the wetland-dominated (72%) Lena Delta could be defined by this classification approach. The mean daily methane emission of the entire Lena Delta was calculated as 10.35 mg CH₄/m²/d. Taking our multi-scale approach into account, we find that the methane source strength of certain tundra wetland types is lower than calculated previously on coarser scales.
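
For readers unfamiliar with supervised minimum distance classification, the following is a minimal sketch of the general technique, not the study's actual workflow: each class is represented by the mean spectral vector of its training pixels, and a pixel is assigned to the class whose mean is nearest in Euclidean distance. The class names, band count, and reflectance values are hypothetical.

```python
import math


def class_means(training):
    """Mean spectral vector per class, computed from labelled training pixels."""
    means = {}
    for label, pixels in training.items():
        n_bands = len(pixels[0])
        means[label] = tuple(
            sum(pixel[band] for pixel in pixels) / len(pixels)
            for band in range(n_bands)
        )
    return means


def classify(pixel, means):
    """Assign the pixel to the class whose mean vector is closest (Euclidean)."""
    return min(means, key=lambda label: math.dist(pixel, means[label]))


# Hypothetical two-band reflectances for two illustrative classes.
training = {
    "water": [(0.05, 0.02), (0.07, 0.03)],
    "wet_tundra": [(0.10, 0.30), (0.12, 0.34)],
}
means = class_means(training)
label = classify((0.06, 0.04), means)
print(label)
```

The method needs only class means, which is one reason it can work with few training sites.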

Relevance:

30.00%

Publisher:

Abstract:

Assessing frequency and extent of mass movement at continental margins is crucial to evaluate risks for offshore constructions and coastal areas. A multidisciplinary approach including geophysical, sedimentological, geotechnical, and geochemical methods was applied to investigate multistage mass transport deposits (MTDs) off Uruguay, on top of which no surficial hemipelagic drape was detected based on echosounder data. Non-steady-state pore water conditions are evidenced by a distinct gradient change in the sulfate (SO₄²⁻) profile at 2.8 m depth. A sharp sedimentological contact at 2.43 m coincides with an abrupt downward increase in shear strength from approx. 10 to >20 kPa. This boundary is interpreted as a paleosurface (and top of an older MTD) that has recently been covered by a sediment package during a younger landslide event. This youngest MTD supposedly originated from an upslope position and carried its initial pore water signature downward. The kink in the SO₄²⁻ profile approx. 35 cm below the sedimentological and geotechnical contact indicates that bioirrigation affected the paleosurface before deposition of the youngest MTD. Based on modeling of the diffusive re-equilibration of SO₄²⁻, the age of the most recent MTD is estimated to be <30 years. The mass movement was possibly related to an earthquake in 1988 (approx. 70 km southwest of the core location). Probabilistic slope stability back analysis of general landslide structures in the study area reveals that slope failure initiation requires additional ground accelerations. Therefore, we consider the earthquake as a reasonable trigger if additional weakening processes (e.g., erosion by previous retrogressive failure events or excess pore pressures) preconditioned the slope for failure. Our study reveals the necessity of multidisciplinary approaches to accurately recognize and date recent slope failures in complex settings such as the investigated area.

Relevance:

30.00%

Publisher:

Abstract:

Permafrost degradation influences the morphology, biogeochemical cycling and hydrology of Arctic landscapes over a range of time scales. To reconstruct temporal patterns of early to late Holocene permafrost and thermokarst dynamics, site-specific palaeo-records are needed. Here we present a multi-proxy study of a 350-cm-long permafrost core from a drained lake basin on the northern Seward Peninsula, Alaska, revealing Lateglacial to Holocene thermokarst lake dynamics in a central location of Beringia. Use of radiocarbon dating, micropalaeontology (ostracods and testaceans), sedimentology (grain-size analyses, magnetic susceptibility, tephra analyses), geochemistry (total nitrogen and carbon, total organic carbon, δ¹³Corg) and stable water isotopes (δ¹⁸O, δD, d excess) of ground ice allowed the reconstruction of several distinct thermokarst lake phases. These include a pre-lacustrine environment at the base of the core characterized by the Devil Mountain Maar tephra (22 800±280 cal. a BP, Unit A), which has vertically subsided in places due to subsequent development of a deep thermokarst lake that initiated around 11 800 cal. a BP (Unit B). At about 9000 cal. a BP this lake transitioned from a stable depositional environment to a very dynamic lake system (Unit C) characterized by fluctuating lake levels, potentially intermediate wetland development, and expansion and erosion of shore deposits. Complete drainage of this lake occurred at 1060 cal. a BP, including post-drainage sediment freezing from the top down to 154 cm and gradual accumulation of terrestrial peat (Unit D), as well as uniform upward talik refreezing. This core-based reconstruction of multiple thermokarst lake generations since 11 800 cal. a BP improves our understanding of the temporal scales of thermokarst lake development from initiation to drainage, demonstrates complex landscape evolution in the ice-rich permafrost regions of Central Beringia during the Lateglacial and Holocene, and enhances our understanding of biogeochemical cycles in thermokarst-affected regions of the Arctic.

Relevance:

30.00%

Publisher:

Abstract:

Studies assume that socioeconomic status determines individuals’ states of health, but how does health determine socioeconomic status? And how does this association vary depending on contextual differences? To answer these questions, our study uses an additive Bayesian network model to explain the interrelationships between health and socioeconomic determinants using complex and messy data. This model has been used to find the most probable network structure describing the interdependence of these factors in five European welfare state regimes. The advantage of this study is that it offers a specific picture of the complex interrelationship between socioeconomic determinants and health, producing a network that is controlled for sociodemographic factors such as gender and age. The present work provides a general framework to describe and understand the complex association between socioeconomic determinants and health.

Relevance:

30.00%

Publisher:

Abstract:

We thank the High-Throughput Genomics Group at the Wellcome Trust Centre for Human Genetics and the Wellcome Trust Sanger Institute for the generation of the sequencing data. This work was funded by Wellcome Trust grant 090532/Z/09/Z (J.F.). Primary phenotyping of the mice was supported by the Mary Lyon Centre and Mammalian Genetics Unit (Medical Research Council, UK Hub grant G0900747 91070 and Medical Research Council, UK grant MC U142684172). D.A.B acknowledges support from NIH R01AR056280. The sleep work was supported by the state of Vaud (Switzerland) and the Swiss National Science Foundation (SNF 14694 and 136201 to P.F.). The ECG work was supported by the Netherlands CardioVascular Research Initiative (Dutch Heart Foundation, Dutch Federation of University Medical Centres, the Netherlands Organization for Health Research and Development, and the Royal Netherlands Academy of Sciences) PREDICT project, InterUniversity Cardiology Institute of the Netherlands (ICIN; 061.02; C.A.R., C.R.B). Na Cai is supported by the Agency of Science, Technology and Research (A*STAR) Graduate Academy. The authors wish to acknowledge excellent technical assistance from: Ayako Kurioka, Leo Swadling, Catherine de Lara, James Ussher, Rachel Townsend, Sima Lionikaite, Ausra S. Lionikiene, Rianne Wolswinkel and Inge van der Made. We would like to thank Thomas M Keane and Anthony G Doran for their help in annotating variants and adding the FVB/NJ strain to the Mouse Genomes Project.

Relevance:

30.00%

Publisher:

Abstract:

Human use of the oceans is increasingly in conflict with conservation of endangered species. Methods for managing the spatial and temporal placement of industries such as military, fishing, transportation and offshore energy, have historically been post hoc; i.e. the time and place of human activity is often already determined before assessment of environmental impacts. In this dissertation, I build robust species distribution models in two case study areas, US Atlantic (Best et al. 2012) and British Columbia (Best et al. 2015), predicting presence and abundance respectively, from scientific surveys. These models are then applied to novel decision frameworks for preemptively suggesting optimal placement of human activities in space and time to minimize ecological impacts: siting for offshore wind energy development, and routing ships to minimize risk of striking whales. Both decision frameworks relate the tradeoff between conservation risk and industry profit with synchronized variable and map views as online spatial decision support systems.

For siting offshore wind energy development (OWED) in the U.S. Atlantic (chapter 4), bird density maps are combined across species with weights of OWED sensitivity to collision and displacement, and 10 km² sites are compared against OWED profitability based on average annual wind speed at 90 m hub heights and distance to the transmission grid. A spatial decision support system enables toggling between the map and tradeoff plot views by site. A selected site can be inspected for sensitivity to cetaceans throughout the year, so as to identify the months of the year that minimize episodic impacts of pre-operational activities such as seismic airgun surveying and pile driving.

Routing ships to avoid whale strikes (chapter 5) can be similarly viewed as a tradeoff, but is a different problem spatially. A cumulative cost surface is generated from density surface maps and conservation status of cetaceans, and is then applied as a resistance surface to calculate least-cost routes between start and end locations, i.e. ports and entrance locations to study areas. Varying a multiplier to the cost surface enables calculation of multiple routes with different costs to conservation of cetaceans versus cost to the transportation industry, measured as distance. Similar to the siting chapter, a spatial decision support system enables toggling between the map and tradeoff plot view of proposed routes. The user can also input arbitrary start and end locations to calculate the tradeoff on the fly.
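
The routing idea in this paragraph can be sketched as a shortest-path search over a gridded resistance surface. The example below is an illustrative assumption rather than the dissertation's implementation: it uses Dijkstra's algorithm on a 4-connected grid, charges one unit of distance per step plus `multiplier` times a hypothetical conservation risk for the cell entered, and shows how raising the multiplier diverts the route around a high-risk cell.

```python
import heapq


def least_cost_route(risk, start, end, multiplier):
    """Dijkstra on a 4-connected grid.

    Step cost = 1 (distance) + multiplier * risk of the cell being entered.
    Assumes `end` is reachable from `start`.
    """
    rows, cols = len(risk), len(risk[0])
    best = {start: 0.0}
    parent = {}
    heap = [(0.0, start)]
    while heap:
        cost, cell = heapq.heappop(heap)
        if cell == end:
            break
        if cost > best[cell]:
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + 1.0 + multiplier * risk[nr][nc]
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    parent[(nr, nc)] = cell
                    heapq.heappush(heap, (new_cost, (nr, nc)))
    # Walk back from the end cell to recover the route.
    path = [end]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return path


# Hypothetical risk surface with one high-density cell in the middle.
risk = [
    [0.0, 0.0, 0.0],
    [0.0, 5.0, 0.0],
    [0.0, 0.0, 0.0],
]
direct = least_cost_route(risk, (1, 0), (1, 2), multiplier=0.0)
detour = least_cost_route(risk, (1, 0), (1, 2), multiplier=1.0)
print(direct)
print(detour)
```

Sweeping the multiplier over a range of values yields a family of routes, each trading extra distance for reduced exposure, which is the kind of tradeoff curve the paragraph describes.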

Essential to the input of these decision frameworks are distributions of the species. The two preceding chapters comprise species distribution models from two case study areas, U.S. Atlantic (chapter 2) and British Columbia (chapter 3), predicting presence and density, respectively. Although density is preferred to estimate potential biological removal, per Marine Mammal Protection Act requirements in the U.S., all the necessary parameters, especially distance and angle of observation, are less readily available across publicly mined datasets.

In the case of predicting cetacean presence in the U.S. Atlantic (chapter 2), I extracted datasets from the online OBIS-SEAMAP geo-database, and integrated scientific surveys conducted by ship (n=36) and aircraft (n=16), weighting a Generalized Additive Model by minutes surveyed within space-time grid cells to harmonize effort between the two survey platforms. For each of 16 cetacean species guilds, I predicted the probability of occurrence from static environmental variables (water depth, distance to shore, distance to continental shelf break) and time-varying conditions (monthly sea-surface temperature). To generate maps of presence vs. absence, Receiver Operating Characteristic (ROC) curves were used to define the optimal threshold that minimizes false positive and false negative error rates. I integrated model outputs, including tables (species in guilds, input surveys) and plots (fit of environmental variables, ROC curve), into an online spatial decision support system, allowing for easy navigation of models by taxon, region, season, and data provider.
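
One common way to turn predicted probabilities into a presence/absence map is to pick the cutoff along the ROC curve where the false positive rate and false negative rate sum to a minimum. The sketch below assumes that equally weighted criterion, which the abstract does not spell out, and uses made-up scores and labels.

```python
def optimal_threshold(scores, labels):
    """Return (cutoff, error) minimizing FPR + FNR.

    Presence is predicted when score >= cutoff; labels are 1 (present) or 0 (absent).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_error = None, float("inf")
    for cutoff in sorted(set(scores)):
        false_neg = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
        false_pos = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
        error = false_neg / positives + false_pos / negatives
        if error < best_error:  # ties keep the lower cutoff
            best_cutoff, best_error = cutoff, error
    return best_cutoff, best_error


# Hypothetical predicted probabilities and observed presence (1) / absence (0).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.8]
labels = [0, 0, 1, 0, 1, 1]
cutoff, error = optimal_threshold(scores, labels)
print(cutoff, round(error, 3))
```

Minimizing FPR + FNR is equivalent to maximizing sensitivity plus specificity; a different weighting of the two error types would move the cutoff.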

For predicting cetacean density within the inner waters of British Columbia (chapter 3), I calculated density from systematic, line-transect marine mammal surveys over multiple years and seasons (summer 2004, 2005, 2008, and spring/autumn 2007) conducted by Raincoast Conservation Foundation. Abundance estimates were calculated using two different methods: Conventional Distance Sampling (CDS) and Density Surface Modelling (DSM). CDS generates a single density estimate for each stratum, whereas DSM explicitly models spatial variation and offers potential for greater precision by incorporating environmental predictors. Although DSM yields a more relevant product for the purposes of marine spatial planning, CDS has proven to be useful in cases where there are fewer observations available for seasonal and inter-annual comparison, particularly for the scarcely observed elephant seal. Abundance estimates are provided on a stratum-specific basis. Steller sea lions and harbour seals are further differentiated by ‘hauled out’ and ‘in water’. This analysis updates previous estimates (Williams & Thomas 2007) by including additional years of effort, providing greater spatial precision with the DSM method over CDS, novel reporting for spring and autumn seasons (rather than summer alone), and providing new abundance estimates for Steller sea lion and northern elephant seal. In addition to providing a baseline of marine mammal abundance and distribution, against which future changes can be compared, this information offers the opportunity to assess the risks posed to marine mammals by existing and emerging threats, such as fisheries bycatch, ship strikes, and increased oil spill and ocean noise issues associated with increases of container ship and oil tanker traffic in British Columbia’s continental shelf waters.

Starting with marine animal observations at specific coordinates and times, I combine these data with environmental data, often satellite derived, to produce seascape predictions generalizable in space and time. These habitat-based models enable prediction of encounter rates and, in the case of density surface models, abundance that can then be applied to management scenarios. Specific human activities, OWED and shipping, are then compared within a tradeoff decision support framework, enabling interchangeable map and tradeoff plot views. These products make complex processes transparent for gaming conservation, industry and stakeholders towards optimal marine spatial management, fundamental to the tenets of marine spatial planning, ecosystem-based management and dynamic ocean management.

Relevance:

30.00%

Publisher:

Abstract:

Constant technology advances have caused a data explosion in recent years. Accordingly, modern statistical and machine learning methods must be adapted to deal with complex and heterogeneous data types. This is particularly true for analyzing biological data. For example, DNA sequence data can be viewed as categorical variables, with each nucleotide taking one of four categories. Gene expression data, depending on the quantitative technology, could be continuous numbers or counts. With the advancement of high-throughput technology, the abundance of such data has become unprecedentedly rich. Therefore, efficient statistical approaches are crucial in this big data era.

Previous statistical methods for big data often aim to find low dimensional structures in the observed data. For example, in a factor analysis model a latent Gaussian distributed multivariate vector is assumed. With this assumption, a factor model produces a low rank estimation of the covariance of the observed variables. Another example is the latent Dirichlet allocation model for documents, in which the mixture proportions of topics are assumed to be represented by a Dirichlet distributed variable. This dissertation proposes several novel extensions to the previous statistical methods that are developed to address challenges in big data. Those novel methods are applied in multiple real world applications, including construction of condition specific gene co-expression networks, estimating shared topics among newsgroups, analysis of promoter sequences, analysis of political-economics risk data and estimating population structure from genotype data.

Relevance:

30.00%

Publisher:

Abstract:

A class of multi-process models is developed for collections of time indexed count data. Autocorrelation in counts is achieved with dynamic models for the natural parameter of the binomial distribution. In addition to modeling binomial time series, the framework includes dynamic models for multinomial and Poisson time series. Markov chain Monte Carlo (MCMC) and Pólya-Gamma data augmentation (Polson et al., 2013) are critical for fitting multi-process models of counts. To facilitate computation when the counts are high, a Gaussian approximation to the Pólya-Gamma random variable is developed.
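
The abstract does not say how the Gaussian approximation is constructed. One plausible construction, shown here purely as an assumption, is moment matching: a PG(b, c) variable with integer b is a sum of b independent PG(1, c) variables, so for large b a normal distribution with the same mean and variance is a natural stand-in. The closed-form moments used below are the standard ones for the Pólya-Gamma family; the function names are hypothetical.

```python
import math
import random


def pg_mean(b, c):
    """Mean of PG(b, c): b / (2c) * tanh(c / 2), with limit b / 4 as c -> 0."""
    if abs(c) < 1e-6:
        return b / 4.0
    return b / (2.0 * c) * math.tanh(c / 2.0)


def pg_var(b, c):
    """Variance of PG(b, c): b / (4 c^3) * (sinh c - c) / cosh^2(c / 2).

    Uses the limit b / 24 near c = 0; math.sinh overflows for very large |c|.
    """
    if abs(c) < 1e-3:
        return b / 24.0
    return b / (4.0 * c ** 3) * (math.sinh(c) - c) / math.cosh(c / 2.0) ** 2


def gaussian_pg_draw(b, c, rng):
    """Moment-matched normal draw standing in for a PG(b, c) sample (large b)."""
    return rng.gauss(pg_mean(b, c), math.sqrt(pg_var(b, c)))


rng = random.Random(0)
draw = gaussian_pg_draw(1000, 1.0, rng)
print(pg_mean(1000, 1.0), pg_var(1000, 1.0), draw)
```

Because the normal approximation can in principle return non-positive values while a Pólya-Gamma variable is strictly positive, a sketch like this is only sensible when b is large enough that the mean sits many standard deviations above zero.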

Three applied analyses are presented to explore the utility and versatility of the framework. The first analysis develops a model for complex dynamic behavior of themes in collections of text documents. Documents are modeled as a “bag of words”, and the multinomial distribution is used to characterize uncertainty in the vocabulary terms appearing in each document. State-space models for the natural parameters of the multinomial distribution induce autocorrelation in themes and their proportional representation in the corpus over time.

The second analysis develops a dynamic mixed membership model for Poisson counts. The model is applied to a collection of time series which record neuron level firing patterns in rhesus monkeys. The monkey is exposed to two sounds simultaneously, and Gaussian processes are used to smoothly model the time-varying rate at which the neuron’s firing pattern fluctuates between features associated with each sound in isolation.

The third analysis presents a switching dynamic generalized linear model for the time-varying home run totals of professional baseball players. The model endows each player with an age specific latent natural ability class and a performance enhancing drug (PED) use indicator. As players age, they randomly transition through a sequence of ability classes in a manner consistent with traditional aging patterns. When the performance of the player significantly deviates from the expected aging pattern, he is identified as a player whose performance is consistent with PED use.

All three models provide a mechanism for sharing information across related series locally in time. The models are fit with variations on the Pólya-Gamma Gibbs sampler, MCMC convergence diagnostics are developed, and reproducible inference is emphasized throughout the dissertation.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The six-layered neuron structure in the cerebral cortex is the foundation for human mental abilities. In the developing cerebral cortex, neural stem cells undergo proliferation and differentiate into intermediate progenitors and neurons, a process known as embryonic neurogenesis. Disrupted embryonic neurogenesis is the root cause of a wide range of neurodevelopmental disorders, including microcephaly and intellectual disabilities. Multiple layers of regulatory networks have been identified and extensively studied over the past decades to understand this complex but extremely crucial process of brain development. In recent years, post-transcriptional RNA regulation through RNA binding proteins has emerged as a critical regulatory nexus in embryonic neurogenesis. The exon junction complex (EJC) is a highly conserved RNA binding complex composed of four core proteins, Magoh, Rbm8a, Eif4a3, and Casc3. The EJC plays a major role in regulating RNA splicing, nuclear export, subcellular localization, translation, and nonsense mediated RNA decay. Human genetic studies have associated individual EJC components with various developmental disorders. We showed previously that haploinsufficiency of Magoh causes microcephaly and disrupted neural stem cell differentiation in mouse. However, it is unclear if other EJC core components are also required for embryonic neurogenesis. More importantly, the molecular mechanism through which the EJC regulates embryonic neurogenesis remains largely unknown. Here, we demonstrated with genetically modified mouse models that both Rbm8a and Eif4a3 are required for proper embryonic neurogenesis and the formation of a normal brain. Using transcriptome and proteomic analysis, we showed that the EJC posttranscriptionally regulates genes involved in the p53 pathway, splicing and translation regulation, as well as ribosomal biogenesis. 
This is the first in vivo evidence suggesting that the etiology of EJC associated neurodevelopmental diseases can be ribosomopathies. We also showed that, unlike depletion of other EJC core components, depletion of Casc3 led to only mild neurogenesis defects in the mouse model. However, our data suggested that Casc3 is required for embryo viability and developmental progression, and is potentially a regulator of cardiac development. Together, the data presented in this thesis suggest that the EJC is crucial for embryonic neurogenesis and that the EJC and its peripheral factors may regulate development in a tissue-specific manner.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Background: The growing prevalence and associated burden of diet-related non-communicable diseases is a global public health concern. The environments in which people live and work influence their dietary behaviours.

Aim: The focus of this thesis was on the effectiveness of complex workplace dietary interventions. The comparative effectiveness of a complex workplace environmental dietary modification intervention and an educational intervention was assessed, both alone and in combination, relative to a control workplace setting.

Methods: The systematic review was guided by the PRISMA statement. In a cluster controlled trial, four workplaces were purposively allocated to control, nutrition education alone (Education), environmental dietary modification alone (Environment), and nutrition education combined with environmental dietary modification (Combined intervention). The interventions were guided by the MRC framework. In the control workplace, data were collected at baseline and follow-up. In the intervention related sub-study, the relationships between nutrition knowledge, diet quality and hypertension were examined.

Results: The systematic review provided limited evidence. In the FCW study, 850 employees aged 18-64 years were recruited at baseline, with N (response rate %) in each workplace as follows: Control: 111 (72%), Education: 226 (71%), Environment: 113 (91%), Combined intervention: 400 (61%). Complete follow-up data were obtained for 517 employees (61%). There were significant positive changes in dietary intakes of saturated fat (p=0.013) and salt (p=0.010) and in nutrition knowledge (p=0.034) between baseline and follow-up at 7-9 months in the combined intervention versus the control workplace in the fully adjusted multivariate analysis. Small but significant changes in BMI (-1.2 kg/m², p=0.047) were also observed in the combined intervention. In the sub-study, nutrition knowledge was significantly and positively associated with diet quality and blood pressure, but no evidence of a mediation effect of the DASH score was detected between nutrition knowledge and blood pressure.

Conclusion: This thesis provides critical evidence on the effectiveness of complex workplace dietary interventions in a manufacturing working population.