Biblioteca Digital

15 results for Large Data Sets

in Helda - Digital Repository of University of Helsinki

Discovery of frequent patterns in large data collections

Relevance:

100.00%

Publisher:

A metabolomic approach to studying mechanisms of polymeric gene delivery to retinal pigment epithelial cells

Relevance:

100.00%

Publisher:

Abstract:

Metabolomics is a rapidly growing research field that studies the response of biological systems to environmental factors, disease states and genetic modifications. It aims at measuring the complete set of endogenous metabolites, i.e. the metabolome, in a biological sample such as plasma or cells. Because metabolites are the intermediates and end products of biochemical reactions, metabolite compositions and metabolite levels in biological samples can provide a wealth of information on ongoing processes in a living system. Due to the complexity of the metabolome, metabolomic analysis poses a challenge to analytical chemistry. Adequate sample preparation is critical to accurate and reproducible analysis, and the analytical techniques must have high resolution and sensitivity to allow detection of as many metabolites as possible. Furthermore, as the information contained in the metabolome is immense, the data set collected from metabolomic studies is very large. In order to extract the relevant information from such large data sets, efficient data processing and multivariate data analysis methods are needed. In the research presented in this thesis, metabolomics was used to study mechanisms of polymeric gene delivery to retinal pigment epithelial (RPE) cells. The aim of the study was to detect differences in metabolomic fingerprints between transfected cells and non-transfected controls, and thereafter to identify metabolites responsible for the discrimination. The plasmid pCMV-β was introduced into RPE cells using the vector polyethyleneimine (PEI). The samples were analyzed using high performance liquid chromatography (HPLC) and ultra performance liquid chromatography (UPLC) coupled to a triple quadrupole (QqQ) mass spectrometer (MS). The software MZmine was used for raw data processing and principal component analysis (PCA) was used in statistical data analysis. 
The results revealed differences in metabolomic fingerprints between transfected cells and non-transfected controls. However, reliable fingerprinting data could not be obtained because of low analysis repeatability. Therefore, no attempts were made to identify metabolites responsible for discrimination between sample groups. Repeatability and accuracy of analyses can be influenced by protocol optimization. However, in this study, optimization of analytical methods was hindered by the very small number of samples available for analysis. In conclusion, this study demonstrates that obtaining reliable fingerprinting data is technically demanding, and the protocols need to be thoroughly optimized in order to approach the goals of gaining information on mechanisms of gene delivery.

Computational Methods for Detecting Large-Scale Chromosome Rearrangements in SNP Data

Relevance:

100.00%

Publisher:

Abstract:

Large-scale chromosome rearrangements such as copy number variants (CNVs) and inversions encompass a considerable proportion of the genetic variation between human individuals. In a number of cases, they have been closely linked with various inheritable diseases. Single-nucleotide polymorphisms (SNPs) are another large part of the genetic variance between individuals. They are also typically abundant, and measuring them is straightforward and cheap. This thesis presents computational means of using SNPs to detect the presence of inversions and deletions, a particular variety of CNVs. Technically, the inversion-detection algorithm detects the suppressed recombination rate between inverted and non-inverted haplotype populations, whereas the deletion-detection algorithm uses the EM algorithm to estimate the haplotype frequencies of a window with and without a deletion haplotype. As a contribution to population biology, a coalescent simulator for simulating inversion polymorphisms has been developed. Coalescent simulation is a backward-in-time method of modelling population ancestry. Technically, the simulator also models multiple crossovers by using the Counting model as the chiasma interference model. Finally, this thesis includes an experimental section. The aforementioned methods were tested on synthetic data to evaluate their power and specificity. They were also applied to the HapMap Phase II and Phase III data sets, yielding a number of candidates for previously unknown inversions and deletions, and also correctly detecting known rearrangements of these kinds.
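
As a rough illustration of the EM step mentioned in the abstract above, the following sketch estimates haplotype frequencies from unphased genotypes at two biallelic SNPs. It is the textbook EM for this problem and not the thesis's deletion-detection algorithm: no deletion haplotype is modelled, and the function names, the 0/1/2 genotype coding and the fixed iteration count are all assumptions made for the example.

```python
from itertools import combinations_with_replacement

# The four possible haplotypes over two biallelic SNPs (0 = reference, 1 = alternative).
HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def compatible_pairs(genotype):
    """Return the unordered haplotype pairs whose allele counts add up to the genotype."""
    pairs = []
    for h1, h2 in combinations_with_replacement(HAPLOTYPES, 2):
        if tuple(a + b for a, b in zip(h1, h2)) == tuple(genotype):
            pairs.append((h1, h2))
    return pairs


def em_haplotype_frequencies(genotypes, iterations=100):
    """Estimate haplotype frequencies from unphased genotypes coded as allele counts."""
    freq = {h: 1.0 / len(HAPLOTYPES) for h in HAPLOTYPES}  # uniform start (assumption)
    for _ in range(iterations):
        counts = {h: 0.0 for h in HAPLOTYPES}
        for genotype in genotypes:
            pairs = compatible_pairs(genotype)
            # E-step: probability of each phase under the current frequencies;
            # a heterozygous pair can arise in two orders, hence the factor 2.
            weights = [freq[h1] * freq[h2] * (1 if h1 == h2 else 2) for h1, h2 in pairs]
            total = sum(weights)
            if total == 0.0:
                continue  # genotype impossible under the current estimate; skip it
            for (h1, h2), weight in zip(pairs, weights):
                counts[h1] += weight / total
                counts[h2] += weight / total
        # M-step: expected haplotype counts divided by the number of chromosomes.
        n_chromosomes = 2 * len(genotypes)
        freq = {h: c / n_chromosomes for h, c in counts.items()}
    return freq
```

Only the double heterozygote `(1, 1)` is ambiguous here; every other genotype determines its haplotype pair uniquely, so the EM iterations serve only to resolve the phase of those individuals.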

Discovering hidden structures in molecular data using a Bayesian partition model approach

Relevance:

100.00%

Publisher:

Abstract:

Advances in analysis techniques have led to a rapid accumulation of biological data in databases. Such data are often in the form of sequences of observations, examples including DNA sequences and amino acid sequences of proteins. The scale and quality of the data promise to answer various biologically relevant questions in more detail than has been possible before. For example, one may wish to identify regions in an amino acid sequence that are important for the function of the corresponding protein, or investigate how characteristics on the level of DNA sequence affect the adaptation of a bacterial species to its environment. Many of the interesting questions are intimately associated with the understanding of the evolutionary relationships among the items under consideration. The aim of this work is to develop novel statistical models and computational techniques to meet the challenge of deriving meaning from the increasing amounts of data. Our main concern is modeling the evolutionary relationships based on the observed molecular data. We operate within a Bayesian statistical framework, which allows a probabilistic quantification of the uncertainties related to a particular solution. As the basis of our modeling approach we utilize a partition model, which is used to describe the structure of data by appropriately dividing the data items into clusters of related items. Generalizations and modifications of the partition model are developed and applied to various problems. Large-scale data sets also pose a computational challenge. The models used to describe the data must be realistic enough to capture the essential features of the current modeling task but, at the same time, simple enough to make it possible to carry out the inference in practice. The partition model fulfills these two requirements. The problem-specific features can be taken into account by modifying the prior probability distributions of the model parameters. 
The computational efficiency stems from the ability to integrate out the parameters of the partition model analytically, which enables the use of efficient stochastic search algorithms.
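
To make the idea of integrating out the parameters analytically concrete, here is a minimal sketch that scores a partition of aligned sequences by its log marginal likelihood. It assumes a symmetric Dirichlet prior on the symbol frequencies of each cluster and independent sites; the abstract does not state these choices, so the prior, the `alpha` value and the function names are illustrative assumptions only.

```python
from collections import Counter
from math import lgamma


def log_marginal_cluster(column, alphabet, alpha=1.0):
    """Log marginal likelihood of one site in one cluster.

    The multinomial symbol frequencies are integrated out against a
    symmetric Dirichlet(alpha) prior, which has a closed form.
    """
    counts = Counter(column)
    k = len(alphabet)
    n = len(column)
    result = lgamma(k * alpha) - lgamma(k * alpha + n)
    for symbol in alphabet:
        result += lgamma(alpha + counts[symbol]) - lgamma(alpha)
    return result


def log_marginal_partition(sequences, partition, alphabet="ACGT", alpha=1.0):
    """Sum the closed-form scores over clusters and sites (sites assumed independent)."""
    total = 0.0
    length = len(sequences[0])
    for cluster in partition:  # each cluster is a list of sequence indices
        for site in range(length):
            column = [sequences[i][site] for i in cluster]
            total += log_marginal_cluster(column, alphabet, alpha)
    return total
```

Because each candidate partition can be scored with a handful of `lgamma` calls and no sampling of model parameters, a stochastic search can compare many partitions cheaply, which is the efficiency argument the abstract makes.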

Efficient search for statistically significant dependency rules in binary data

Relevance:

100.00%

Publisher:

Abstract:

Analyzing statistical dependencies is a fundamental problem in all empirical science. Dependencies help us understand causes and effects, create new scientific theories, and invent cures to problems. Nowadays, large amounts of data are available, but efficient computational tools for analyzing the data are missing. In this research, we develop efficient algorithms for a commonly occurring search problem - searching for the statistically most significant dependency rules in binary data. We consider dependency rules of the form X->A or X->not A, where X is a set of positive-valued attributes and A is a single attribute. Such rules describe which factors either increase or decrease the probability of the consequent A. Classical examples are genetic and environmental factors, which can either cause or prevent a disease. The emphasis in this research is that the discovered dependencies should be genuine - i.e. they should also hold in future data. This is an important distinction from the traditional association rules, which - in spite of their name and a similar appearance to dependency rules - do not necessarily represent statistical dependencies at all or represent only spurious connections, which occur by chance. Therefore, the principal objective is to search for the rules with statistical significance measures. Another important objective is to search for only non-redundant rules, which express the real causes of dependence, without any occasional extra factors. The extra factors do not add any new information on the dependence, but can only blur it and make it less accurate in future data. The problem is computationally very demanding, because the number of all possible rules increases exponentially with the number of attributes. In addition, neither statistical dependency nor statistical significance is a monotonic property, which means that the traditional pruning techniques do not work. 
As a solution, we first derive the mathematical basis for pruning the search space with any well-behaved statistical significance measure. The mathematical theory is complemented by a new algorithmic invention, which enables an efficient search without any heuristic restrictions. The resulting algorithm can be used to search for both positive and negative dependencies with any commonly used statistical measures, like Fisher's exact test, the chi-squared measure, mutual information, and z scores. According to our experiments, the algorithm scales well, especially with Fisher's exact test. It can easily handle even the densest data sets with 10000-20000 attributes. Still, the results are globally optimal, which is a remarkable improvement over the existing solutions. In practice, this means that the user does not have to worry whether the dependencies hold in future data or if the data still contains better, but undiscovered dependencies.
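
For readers unfamiliar with how a single rule X->A is scored, the sketch below evaluates one positive dependency rule with the one-sided Fisher's exact test. It only scores a given rule; it does not implement the search or the pruning theory described in the abstract. The row layout (tuples of 0/1 values) and the function name are assumptions made for this example.

```python
from math import comb


def fisher_p_positive(rows, x_attrs, a_attr):
    """One-sided Fisher's exact p-value for the positive rule X -> A.

    rows    : list of binary tuples, one per data row
    x_attrs : indices of the attributes forming the rule antecedent X
    a_attr  : index of the consequent attribute A
    """
    n = len(rows)
    covers_x = [all(row[i] for i in x_attrs) for row in rows]
    n_x = sum(covers_x)                          # rows where X holds
    n_a = sum(1 for row in rows if row[a_attr])  # rows where A holds
    n_xa = sum(1 for row, cx in zip(rows, covers_x) if cx and row[a_attr])

    # Probability, under independence with fixed margins, of seeing at least
    # n_xa rows where X and A hold together (hypergeometric upper tail).
    upper = min(n_x, n_a)
    tail = sum(comb(n_x, k) * comb(n - n_x, n_a - k) for k in range(n_xa, upper + 1))
    return tail / comb(n, n_a)
```

A smaller p-value indicates a more significant positive dependency. The negative rule X->not A can be scored the same way after flipping the values of A.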

The effects of the 2004 reduction in the price of alcohol on alcohol-related harm in Finland : A natural experiment based on register data

Relevance:

100.00%

Publisher:

Abstract:

Changes in alcohol pricing have been documented as inversely associated with changes in consumption and alcohol-related problems. Evidence of the association between price changes and health problems is nevertheless patchy and is based to a large extent on cross-sectional state-level data, or time series of such cross-sectional analyses. Natural experimental studies have been called for. There was a substantial reduction in the price of alcohol in Finland in 2004 due to a reduction in alcohol taxes of one third, on average, and the abolition of duty-free allowances for travellers from the EU. These changes in the Finnish alcohol policy could be considered a natural experiment, which offered a good opportunity to study what happens with regard to alcohol-related problems when prices go down. The present study investigated the effects of this reduction in alcohol prices on (1) alcohol-related and all-cause mortality, and mortality due to cardiovascular diseases, (2) alcohol-related morbidity in terms of hospitalisation, (3) socioeconomic differentials in alcohol-related mortality, and (4) small-area differences in interpersonal violence in the Helsinki Metropolitan area. Differential trends in alcohol-related mortality prior to the price reduction were also analysed. A variety of population-based register data was used in the study. Time-series intervention analysis modelling was applied to monthly aggregations of deaths and hospitalisation for the period 1996-2006. These and other mortality analyses were carried out for men and women aged 15 years and over. Socioeconomic differentials in alcohol-related mortality were assessed on a before/after basis, mortality being followed up in 2001-2003 (before the price reduction) and 2004-2005 (after). Alcohol-related mortality was defined in all the studies on mortality on the basis of information on both underlying and contributory causes of death. 
Hospitalisation related to alcohol meant that there was a reference to alcohol in the primary diagnosis. Data on interpersonal violence was gathered from 86 administrative small-areas in the Helsinki Metropolitan area and was also assessed on a before/after basis followed up in 2002-2003 and 2004-2005. The statistical methods employed to analyse these data sets included time-series analysis, and Poisson and linear regression. The results of the study indicate that alcohol-related deaths increased substantially among men aged 40-69 years and among women aged 50-69 after the price reduction when trends and seasonal variation were taken into account. The increase was mainly attributable to chronic causes, particularly liver diseases. Mortality due to cardiovascular diseases and all-cause mortality, on the other hand, decreased considerably among the over-69-year-olds. The increase in alcohol-related mortality in absolute terms among the 30-59-year-olds was largest among the unemployed and early-age pensioners, and those with a low level of education, social class or income. The relative differences in change between the education and social class subgroups were small. The employed and those under the age of 35 did not suffer from increased alcohol-related mortality in the two years following the price reduction. The gap between the age and education groups, which was substantial in the 1980s, thus further broadened. With regard to alcohol-related hospitalisation, there was an increase in both chronic and acute causes among men under the age of 70, and among women in the 50-69-year age group when trends and seasonal variation were taken into account. Alcohol dependence and other alcohol-related mental and behavioural disorders were the largest category both in the total number of chronic hospitalisations and in the increase. There was no increase in the rate of interpersonal violence in the Helsinki Metropolitan area, and even a decrease in domestic violence. 
There was a significant relationship between the measures of social disadvantage on the area level and interpersonal violence, although the differences in the effects of the price reduction between the different areas were small. The findings of the present study suggest that a reduction in alcohol prices may lead to a substantial increase in alcohol-related mortality and morbidity. However, large population group differences were observed regarding responsiveness to the price changes. In particular, the less privileged, such as the unemployed, were most sensitive. In contrast, at least in the Finnish context, the younger generations and the employed do not appear to be adversely affected, and those in the older age groups may even benefit from cheaper alcohol in terms of decreased rates of CVD mortality. The results also suggest that reductions in alcohol prices do not necessarily affect interpersonal violence. The population group differences in the effects of the price changes on alcohol-related harm should be acknowledged, and therefore the policy actions should focus on the population subgroups that are primarily responsive to the price reduction.

Obesity, Smoking and Dieting

Relevance:

90.00%

Publisher:

Abstract:

Obesity and overweight have become more common in recent decades; more than half of the population of Western countries is already overweight and one fifth is obese. The increase in overweight has been especially rapid among young people. Overweight, particularly when combined with abdominal obesity, and smoking increase morbidity from cardiovascular diseases, metabolic diseases such as diabetes, and many cancers. Obesity and smoking are indeed among the most important preventable causes of death in developed countries. Alongside overweight, dieting and even weight-loss methods that are harmful to health, such as smoking as a means of weight control, have become increasingly common. Dieting aimed at rapid weight loss often has consequences harmful to health, such as weight regain beyond the original weight and a shift in body fat distribution towards a less healthy pattern. Three quarters of those who have lost a significant amount of weight report that the weight has returned. The effects of smoking and repeated dieting on the development of overweight and obesity are intertwined. This doctoral thesis examined the effect of repeated dieting and smoking on body weight and, in addition, the effect of smoking on the development of abdominal obesity. A second aim was to examine how strongly smoking and repeated dieting are associated with each other among Finns and whether this association differs between age groups and sexes. The work is based on three large questionnaire data sets: the FinnTwin16 study (Nuorten Kaksosten Terveystutkimus) has followed twins born in 1975-79 at the ages of 16, 17, 18 and 24 years (N=5563). The Finnish Twin Cohort data (N=12 793) were collected in 1990 from same-sex twins born in 1930-57. In the follow-up study of former elite athletes (N=1838) and their matched controls (N=834), data were collected in 1985, 1995 and 2001. Height, weight and smoking were asked about in all questionnaires. 
The twins answered questions on dieting behaviour. The athletes' dieting behaviour was inferred from their sport, since repeated dieting is known to be common among athletes competing in weight classes (e.g. wrestlers, boxers). Smoking in adolescence predicted abdominal obesity in both sexes and, in addition, overweight in women. Repeated dieting was associated with later weight gain and obesity in men. In addition, repeated dieting and smoking were found to be associated with each other in young adults. In the older age groups, men who smoked dieted less often than non-smokers. In preventing the comorbidity associated with obesity and abdominal obesity, reducing smoking and repeated dieting may be more effective than previously thought.

Options for selecting dairy cattle for milk coagulation ability

Relevance:

90.00%

Publisher:

Abstract:

Breeding options for the milk coagulation ability of dairy cows. The thesis examined how the cheese-making quality of dairy cows' milk could be improved through selective breeding. The research topic is important because an ever larger share of milk is used for cheese making. The study focused on milk coagulation ability, as it is one of the key factors affecting cheese yield. Milk coagulation ability varied considerably between cows, sires, herds, breeds and stages of lactation. Although there were large differences between herds in the coagulation ability of bulk tank milk, herd explained only a small part of the total variation in coagulation ability. Genetic differences between cows probably explain most of the differences observed in the coagulation ability of herd bulk tank milk. Good management and feeding did, however, reduce to some extent the proportion of poorly coagulating bulk tank milk in herds. Holstein-Friesian cows had better coagulation ability than Ayrshire cows. Poor coagulation and non-coagulation were only a minor problem in Holstein-Friesians (10 %), whereas a third of Ayrshire cows produced poorly coagulating or non-coagulating milk. Milk is said to be poorly coagulating when the curd is not firm enough to be cut half an hour after the addition of rennet. Milk defined as non-coagulating does not coagulate at all within half an hour and is therefore a very poor raw material for cheese dairies. About 40 % of the differences between cows in milk coagulation ability were explained by genetic factors. Coagulation ability can thus be called a highly heritable trait. Three measurements per cow are quite sufficient for estimating the average coagulation ability of a cow's milk. At present, however, the problem with direct breeding for coagulation ability is the lack of an automated measuring device suitable for large-scale use. 
For this reason, the thesis examined the possibilities of improving milk coagulation ability indirectly, through some other trait. Such a trait must be sufficiently strongly genetically linked to coagulation ability for breeding through it to be possible. The traits studied were milk yield and udder health traits, which are already included in the total merit index of bulls, as well as milk protein and casein content and milk pH, which are not part of the total merit index. The thesis also examined the possibilities for so-called marker-assisted selection by studying the heritability of milk non-coagulation and mapping the chromosomal regions associated with it. Based on the results, breeding for udder health also improves milk coagulation ability somewhat and reduces non-coagulation in Ayrshire cows. Milk yield, by contrast, is genetically independent of both milk coagulation ability and non-coagulation. The genetic association of milk protein and casein content with coagulation ability was also approximately zero. There was a fairly strong genetic association between milk pH and coagulation ability, so breeding for milk pH would also improve milk coagulation ability. It would probably not, however, reduce the number of cows producing non-coagulating milk. Because milk non-coagulation is such a common problem in Finnish Ayrshire cows, the thesis investigated the background of the phenomenon in more detail. In all three research data sets, about 10 % of Ayrshire cows produced non-coagulating milk. During two years of monthly monitoring, some cows produced non-coagulating milk at almost every measurement. Milk non-coagulation was associated with stage of lactation, but none of the environmental factors could fully explain it. Instead, the evidence for its heritability grew stronger as the studies progressed. 
Finally, the research group succeeded in mapping the chromosomal regions causing non-coagulation to chromosomes 2 and 18, close to the DNA markers BMS1126 and BMS1355. Based on the results, milk non-coagulation is not associated with the casein genes that are central to the milk coagulation process. Instead, it is possible that the non-coagulation problem is caused by errors occurring in the post-synthesis modification of the casein genes. The matter, however, requires thorough investigation. Based on the results of the thesis, culling animals that carry the milk non-coagulation gene from the breeding stock would be the most effective way to improve milk coagulation ability in the Finnish dairy cattle population.

Remote sensing of boreal land cover: estimation of forest attributes and extent

Relevance:

90.00%

Publisher:

Abstract:

Remote sensing provides methods to infer land cover information over large geographical areas at a variety of spatial and temporal resolutions. Land cover is input data for a range of environmental models and information on land cover dynamics is required for monitoring the implications of global change. Such data are also essential in support of environmental management and policymaking. Boreal forests are a key component of the global climate and a major sink of carbon. The northern latitudes are expected to experience a disproportionate and rapid warming, which can have a major impact on vegetation at forest limits. This thesis examines the use of optical remote sensing for estimating aboveground biomass, leaf area index (LAI), tree cover and tree height in the boreal forests and tundra-taiga transition zone in Finland. The continuous fields of forest attributes are required, for example, to improve the mapping of forest extent. The thesis focuses on studying the feasibility of satellite data at multiple spatial resolutions, assessing the potential of multispectral, -angular and -temporal information, and providing a regional evaluation of global land cover data. Preprocessed ASTER, MISR and MODIS products are the principal satellite data. The reference data consist of field measurements, forest inventory data and fine resolution land cover maps. Fine resolution studies demonstrate how statistical relationships between biomass and satellite data are relatively strong in single species and low biomass mountain birch forests in comparison to higher biomass coniferous stands. The combination of forest stand data and fine resolution ASTER images provides a method for biomass estimation using medium resolution MODIS data. The multiangular data improve the accuracy of land cover mapping in the sparsely forested tundra-taiga transition zone, particularly in mires. 
Similarly, multitemporal data improve the accuracy of coarse resolution tree cover estimates in comparison to single date data. Furthermore, the peak of the growing season is not necessarily the optimal time for land cover mapping in the northern boreal regions. The evaluated coarse resolution land cover data sets have considerable shortcomings in northernmost Finland and should be used with caution in similar regions. The quantitative reference data and upscaling methods for integrating multiresolution data are required for calibration of statistical models and evaluation of land cover data sets. The preprocessed image products have potential for wider use as they can considerably reduce the time and effort used for data processing.

Bayesian statistical analysis of bacterial diversity

Relevance:

90.00%

Publisher:

Abstract:

Bacteria play an important role in many ecological systems. The molecular characterization of bacteria using either cultivation-dependent or cultivation-independent methods reveals the large scale of bacterial diversity in natural communities, and the vastness of subpopulations within a species or genus. Understanding how bacterial diversity varies across different environments and also within populations should provide insights into many important questions of bacterial evolution and population dynamics. This thesis presents novel statistical methods for analyzing bacterial diversity using widely employed molecular fingerprinting techniques. The first objective of this thesis was to develop Bayesian clustering models to identify bacterial population structures. Bacterial isolates were identified using multilocus sequence typing (MLST), and Bayesian clustering models were used to explore the evolutionary relationships among isolates. Our method involves the inference of genetic population structures via an unsupervised clustering framework where the dependence between loci is represented using graphical models. The population dynamics that generate such a population stratification were investigated using a stochastic model, in which homologous recombination between subpopulations can be quantified within a gene flow network. The second part of the thesis focuses on cluster analysis of community compositional data produced by two different cultivation-independent analyses: terminal restriction fragment length polymorphism (T-RFLP) analysis, and fatty acid methyl ester (FAME) analysis. The cluster analysis aims to group bacterial communities that are similar in composition, which is an important step for understanding the overall influences of environmental and ecological perturbations on bacterial diversity. 
A common feature of T-RFLP and FAME data is zero-inflation, which indicates that the observation of a zero value is much more frequent than would be expected, for example, from a Poisson distribution in the discrete case, or a Gaussian distribution in the continuous case. We provided two strategies for modeling zero-inflation in the clustering framework, which were validated by both synthetic and empirical complex data sets. We show in the thesis that our model, which takes into account dependencies between loci in MLST data, can produce better clustering results than methods that assume independent loci. Furthermore, computer algorithms that are efficient in analyzing large scale data were adopted to meet the increasing computational need. Our method that detects homologous recombination in subpopulations may provide a theoretical criterion for defining bacterial species. The clustering of bacterial community data, including T-RFLP and FAME data, provides an initial effort for discovering the evolutionary dynamics that structure and maintain bacterial diversity in the natural environment.
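
The notion of zero-inflation described above can be checked numerically in the discrete case. The sketch below compares the observed share of zeros with the share a Poisson distribution with the same mean would predict, and gives the probability mass function of the standard zero-inflated Poisson mixture. The abstract does not describe the thesis's two modeling strategies, so this mixture is shown only as a common illustration; the function names and parameters `pi` and `lam` are assumptions.

```python
from math import exp, factorial


def zero_fraction_vs_poisson(counts):
    """Return (observed zero fraction, zero fraction expected under a Poisson fit).

    The Poisson rate is set to the sample mean, so the expected share of
    zeros is exp(-mean). A much larger observed share suggests zero-inflation.
    """
    mean = sum(counts) / len(counts)
    observed = sum(1 for c in counts if c == 0) / len(counts)
    expected = exp(-mean)
    return observed, expected


def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson: with probability pi emit a structural zero,
    otherwise draw from Poisson(lam)."""
    poisson = exp(-lam) * lam ** k / factorial(k)
    if k == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson
```

For a sample such as `[0, 0, 0, 0, 0, 0, 5, 7, 6, 2]`, six of the ten values are zero, far more than a Poisson distribution with mean 2.6 would produce, which is the pattern the abstract calls zero-inflation.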

Aerosol size distribution and its connection to cloud droplet number concentration

Relevance:

90.00%

Publisher:

Abstract:

In order to predict the current state and future development of Earth's climate, detailed information on atmospheric aerosols and aerosol-cloud-interactions is required. Furthermore, these interactions need to be expressed in such a way that they can be represented in large-scale climate models. The largest uncertainties in the estimate of radiative forcing on the present day climate are related to the direct and indirect effects of aerosol. In this work aerosol properties were studied at Pallas and Utö in Finland, and at Mount Waliguan in Western China. Approximately two years of data from each site were analyzed. In addition to this, data from two intensive measurement campaigns at Pallas were used. The measurements at Mount Waliguan were the first long term aerosol particle number concentration and size distribution measurements conducted in this region. They revealed that the number concentration of aerosol particles at Mount Waliguan was much higher than those measured at similar altitudes in other parts of the world. The particles were concentrated in the Aitken size range indicating that they were produced within a couple of days prior to reaching the site, rather than being transported over thousands of kilometers. Aerosol partitioning between cloud droplets and cloud interstitial particles was studied at Pallas during the two measurement campaigns, First Pallas Cloud Experiment (First PaCE) and Second Pallas Cloud Experiment (Second PaCE). The method of using two differential mobility particle sizers (DMPS) to calculate the number concentration of activated particles was found to agree well with direct measurements of cloud droplets. Several parameters important in cloud droplet activation were found to depend strongly on the air mass history. The effects of these parameters partially cancelled each other out. Aerosol number-to-volume concentration ratio was studied at all three sites using data sets with long time-series. 
The ratio was found to vary more than in earlier studies, but less than either aerosol particle number concentration or volume concentration alone. Both air mass dependency and seasonal pattern were found at Pallas and Utö, but only seasonal pattern at Mount Waliguan. The number-to-volume concentration ratio was found to follow the seasonal temperature pattern well at all three sites. A new parameterization for partitioning between cloud droplets and cloud interstitial particles was developed. The parameterization uses aerosol particle number-to-volume concentration ratio and aerosol particle volume concentration as the only information on the aerosol number and size distribution. The new parameterization is computationally more efficient than the more detailed parameterizations currently in use, but the accuracy of the new parameterization was slightly lower. The new parameterization was also compared to directly observed cloud droplet number concentration data, and a good agreement was found.
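
The number-to-volume concentration ratio that the abstract relies on can be computed from a binned size distribution as sketched below. This is only a definitional illustration, not the thesis's parameterization: it assumes spherical particles, bins given as (diameter in micrometres, number concentration per cubic centimetre), and hypothetical function names.

```python
from math import pi


def number_and_volume(bins):
    """Total number and volume concentration from (diameter_um, number_per_cm3) bins.

    Particles are assumed spherical, so one particle of diameter d has
    volume pi / 6 * d**3 (in cubic micrometres).
    """
    bins = list(bins)  # allow any iterable, since it is traversed twice
    number = sum(n for _, n in bins)
    volume = sum(n * pi / 6.0 * d ** 3 for d, n in bins)
    return number, volume


def number_to_volume_ratio(bins):
    """Ratio of number concentration to volume concentration (per cubic micrometre)."""
    number, volume = number_and_volume(bins)
    return number / volume
```

A distribution dominated by small particles gives a high ratio, because many particles contribute little volume, while a distribution dominated by large particles gives a low one.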

Data fusion and matching by maximizing statistical dependencies

Relevance:

90.00%

Publisher:

Abstract:

The core aim of machine learning is to make a computer program learn from experience. Learning from data is usually defined as the task of learning regularities or patterns in data in order to extract useful information, or to learn the underlying concept. An important sub-field of machine learning is multi-view learning, where the task is to learn from multiple data sets or views describing the same underlying concept. A typical example of such a scenario would be to study a biological concept using several biological measurements such as gene expression, protein expression and metabolic profiles, or to classify web pages based on their content and the contents of their hyperlinks. In this thesis, novel problem formulations and methods for multi-view learning are presented. The contributions include a linear data fusion approach for exploratory data analysis, a new measure to evaluate different kinds of representations for textual data, and an extension of multi-view learning to novel scenarios where the correspondence of samples in the different views or data sets is not known in advance. In order to infer the one-to-one correspondence of samples between two views, a novel concept of multi-view matching is proposed. The matching algorithm is completely data-driven and is demonstrated in several applications such as matching of metabolites between humans and mice, and matching of sentences between documents in two languages.
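
The idea of inferring a one-to-one correspondence by maximizing statistical dependency can be illustrated on a toy problem. The sketch below is not the algorithm of the thesis: it assumes one-dimensional views, uses Pearson correlation as a stand-in dependency measure, assumes the dependency is positive, and searches all permutations by brute force, which is only feasible for a handful of samples. All function names are hypothetical.

```python
import itertools
import math


def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant view")
    return cov / math.sqrt(var_a * var_b)


def best_matching(view_x, view_y):
    """Find the pairing of samples that maximizes the correlation.

    Returns (perm, corr), where perm[i] is the index of the sample in
    view_y that is matched to sample i of view_x.
    """
    if len(view_x) != len(view_y):
        raise ValueError("both views must contain the same number of samples")

    best_perm, best_corr = None, -math.inf
    # Brute force over all n! pairings: illustration only.
    for perm in itertools.permutations(range(len(view_y))):
        corr = pearson(view_x, [view_y[j] for j in perm])
        if corr > best_corr:
            best_perm, best_corr = perm, corr
    return best_perm, best_corr
```

For realistic data set sizes the exhaustive search would have to be replaced by an assignment solver, and the one-dimensional correlation by a multivariate dependency measure.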

Growing up in Affiliation with a Religious Community : A Case Study of Seventh-day Adventist Youth in Finland

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This study investigates how the religious community as a socialization context affects the development of young people's religious identity and values, using Finnish Seventh-day Adventism as a context for the case study. The research problem is investigated through the following questions: (1) What aspects support the intergenerational transmission of values and tradition in religious home education? (2) What is the role of social capital and the social networks of the religious community in the religious socialization process? (3) How does the religious composition of the peer group at school (e.g., a denominational school in comparison to a mainstream school) affect these young people's social relations and choices and their religious identity (as challenged versus as reinforced by values at school)? And (4) How do the young people studied negotiate their religious values and religious membership in the diverse social contexts of the society at large? The mixed method study includes both quantitative and qualitative data sets (3 surveys: n=106 young adults, n=100 teenagers, n=55 parents; 2 sets of interviews: n=10 young adults and n=10 teenagers; and fieldwork data from youth summer camps). The results indicate that, in religious home education, the relationship between parents and children, the parental example of a personally meaningful way of life, and encouraging critical thinking in order for young people to make personalized value choices were important factors in socialization. Overall, positive experiences of the religion and the religious community were crucial in providing direction for later choices of values and affiliations. Education that was experienced as either too severe or too permissive was not regarded as a positive influence for accepting similar values and lifestyle choices to those of the parents.
Furthermore, the religious community had an important influence on these young people's religious socialization in terms of the commitment to denominational values and lifestyle and in providing them with religious identity and rooting them in the social network of the denomination. The network of the religious community generated important social resources, or social capital, for both the youth and their families, involving both tangible and intangible benefits, and bridging and bonding effects. However, the study also illustrates the sometimes difficult negotiations the youth face in navigating between differentiation and belonging when there is a tension between the values of a minority group and the larger society, and one wants to and does belong to both. It also demonstrates the variety within both the majority and the minority communities in society, as well as the many different ways one can find a personally meaningful way of being an Adventist. In the light of the previous literature about socialization-in-context in an increasingly pluralistic society, the findings were examined at four levels: individual, family, community and societal. These were seen as both a nested structure and as constructing a funnel in which each broader level directs the influences that reach the narrower ones. The societal setting directs the position and operation of religious communities, families and individuals, and the influences that reach the developing children and young people are in many ways directed by societal, communal and family characteristics. These levels are by nature constantly changing, as well as being constructed of different parts, like the pieces of a jigsaw puzzle, each of which alters in significance: for some negotiations on values and memberships the parental influence may be greater, whereas for others the peer group influences are.
Although agency does remain somewhat connected to others, the growing youth are gradually able to take more responsibility for their own choices and their agency plays a crucial role in the process of choosing values and group memberships. Keywords: youth, community, Adventism, socialization, values, identity negotiations

Perinatal mortality and maternal health in rural China

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Background and context Since the economic reforms of 1978, China has been acclaimed as a remarkable economy, achieving 9% annual growth per head for more than 25 years. However, China's health sector has not fared well. The population health gains slowed down and health disparities increased. In the field of health and health care, significant progress in maternal care has been achieved. However, there still remain important disparities between the urban and rural areas and among the rural areas in terms of economic development. The excess female infant deaths and the rapidly increasing sex ratio at birth in the last decade aroused serious concerns among policy makers and scholars. Decentralization of the government administration and health sector reform impact maternal care. Many studies using census data have been conducted to explore the determinants of a high sex ratio at birth, but no agreement has so far been reached on the possible contributing factors. No study using family planning system data has been conducted to explore perinatal mortality and sex ratio at birth, and only a few studies have examined the impact of the decentralization of government and health sector reforms on the provision and organization of maternal care in rural China. Objectives The general objective of this study was to investigate the state of perinatal health and maternal care and their determinants in rural China in the historical context of major socioeconomic reforms and the one-child family planning policy. The specific objectives of the study included: 1) to study pregnancy outcomes and perinatal health and their correlates in a rural Chinese county; 2) to examine the issue of sex ratio at birth and its determinants in a rural Chinese county; 3) to explore the patterns of provision, utilization, and content of maternal care in a rural Chinese county; 4) to investigate the changes in the use of maternal care in China from 1991 to 2003.
Materials and Methods This study is based on a project for evaluating the prenatal care programme in Dingyuan county, Anhui province, China, in 1999-2003 and on a nationwide household health survey used to describe the changes in maternal care utilization. The approaches used included a retrospective cohort study, cross-sectional interview surveys, informant interviews, observations and the use of statistical data. The data sources included the following: 1) A cohort of pregnant women followed from pregnancy up to 7 days after birth in 20 townships in the study county, collecting information on pregnancy outcomes using family planning records; 2) A questionnaire interview survey given to women who gave birth between 2001 and 2003; 3) Various statistical and informant survey data collected from the study county; 4) Three national household health interview survey data sets (1993-2003), which were reanalyzed to describe the changes in maternity care utilization. Relative risks (RR) and their confidence intervals (CI) were calculated for comparison between parity, approval status, infant sex and township groups. The chi-square test was used to analyse the disparity of use of maternal care between and within urban and rural areas and its trend across the years in China. Logistic regression was used to analyse the factors associated with hospital delivery in rural areas. Results There were 3697 pregnancies in the study cohort, resulting in 3092 live births in a total population of 299463 in the 20 study townships during 1999-2000. The average age at pregnancy in the cohort was 25.9 years. Of the women, 61% were childless, 38% already had one child and 0.3% had two children before the current pregnancy. About 90% of approved pregnancies ended in a live birth while 73% of the unapproved ones were aborted. The perinatal mortality rate was 69 per thousand births.
If the 30 induced abortions in which the gestational age was more than 28 weeks had been counted as perinatal deaths, the perinatal mortality rate would have been as high as 78 per thousand. The perinatal mortality rate was negatively associated with the wealth of the township. Approximately two thirds of the perinatal deaths occurred in the early neonatal period. Both the stillbirth rate and the early neonatal death rate increased with parity. The risk of a stillbirth in a second pregnancy was almost four times that for a first pregnancy, while the risk of early neonatal death doubled. The early neonatal mortality rate was twice as high for female as for male infants. The sex difference in the early neonatal mortality rate was mainly attributable to mortality in second births. The male early neonatal mortality rate was not affected by parity, while the female early neonatal mortality rate increased dramatically with parity: it was about six times higher for second births than for first births. About 82% of early neonatal deaths happened within 24 hours after birth, and during that time, girls were almost three times more likely to die than boys. The death rate of females on the day of birth increased much more sharply with parity than that of males. The total sex ratio at birth of 3697 registered pregnancies was 152 males to 100 females, with 118 and 287 in first and second pregnancies, respectively. Among unapproved pregnancies, there were almost 5 live-born boys for each girl. Most prenatal and delivery care was to be provided in township hospitals. At the village level, there were small private clinics. There was no limitation period for the provision of prenatal and postnatal care by private practitioners. The county health bureau did not permit them to provide delivery care, but as some 12% of all births occurred either at home or at private clinics, some village health workers might have been involved.
The county level hospitals served as the referral centers for the township hospitals in the county. However, there was no formal regulation or guideline on how the referral system should work. Whether or not a woman was referred to a higher level hospital depended on the individual midwife's professional judgment and on the clients' compliance. The county health bureau had little power over township hospitals, because township hospitals had in the decentralization process become directly accountable to the township government. In the township and county hospitals only 10-20% of the recurrent costs were funded by local government (the township hospital was funded by the township government and the county hospital was funded by the county government) and the hospitals collected user fees to balance their budgets. Staff salaries also depended on the hospital's fee income. The hospitals could define the user charges themselves. Prenatal care consultations were, however, free in most township hospitals. None of the midwives made postnatal home visits, because of the low profit from these services. The three national household health survey data sets showed that the proportion of women receiving their first prenatal visit within 12 weeks increased greatly from the early to middle 1990s in all areas except for large cities. The increase was much larger in the rural areas, reducing the urban-rural difference from more than 4 times to about 1.4 times. The proportion of women that received antenatal care visits meeting the Ministry of Health's standard (at least 5 times) in the rural areas increased sharply from 12% in 1991-1993 to 36% in 2001-2003. In rural areas, the proportion increased much faster in less developed areas than in developed areas. The hospital delivery rate increased slightly from 90% to 94% in urban areas while the proportion increased from 27% to 69% in rural areas.
The fastest change was found to be in type 4 rural areas, where the utilization even quadrupled. The overall difference between rural and urban areas was substantially narrowed over the period. Multiple logistic regression analysis shows that time periods, residency in rural or urban areas, income levels, age group, education levels, delivery history, occupation, health insurance and distance from the nearest health care facilities were significantly associated with hospital delivery rates. Conclusions 1. Perinatal mortality in this study was much higher than that for urban areas as well as any reported rate from specific studies in rural areas of China. Previous studies in which calculations of infant mortality were not based on epidemiological surveys have been shown to underestimate the rates by more than 50%. 2. Routine statistics collected by the Chinese family planning system proved to be a reliable data source for studying perinatal health, including stillbirths, neonatal deaths, sex ratio at birth and among newborns. National Household Health Survey data proved to be a useful and reliable data source for studying population health and health services. Prior to this research there were few studies in these areas available to international audiences. 3. Though the perinatal mortality rate was negatively associated with the level of township economic development, the excess female early neonatal mortality rate contributed much more to the high perinatal mortality rate than economic factors. This was likely a result of the role of the family planning policy and the traditional preference for sons, which lead to lethal neglect of female newborns and high perinatal mortality. 4. The selective abortions of female foetuses were likely to contribute most to the high sex ratio at birth. The underreporting of female births seemed to have played a secondary role.
The higher early neonatal mortality rate in second-born as compared to first-born children, particularly in females, may indicate that neglect or poorer care of female newborn infants also contributes to the high sex ratio at birth or among newborns. The existing family planning policy proved not to effectively control the steadily increasing sex ratio at birth. 5. The rural-urban gap in service utilization was on average significantly narrowed in terms of maternal healthcare in China from 1991 to 2003. This demonstrates that significant achievements in reducing inequities can be made through a combination of socio-economic development and targeted investments in improving health services, including infrastructure, staff capacities, and subsidies to reduce the costs of service utilization for the poorest. However, the huge gap which persisted among cities of different size and within different types of rural areas indicated the need for further efforts to support the poorest areas. 6. Hospital delivery care in the study county was better accepted by women because most women thought delivery care was very important while prenatal and postnatal care were not. Hospital delivery care was more systematically provided and promoted than prenatal and postnatal care by township hospitals in the study area. The reliance of hospital staff income on user fees gave the hospitals an incentive to put more emphasis on revenue-generating activities such as delivery care instead of prenatal and postnatal care, since delivery care generated much more profit than prenatal and postnatal care. Recommendations 1. It is essential for the central government to re-assess and modify existing family planning policies. In order to keep the national sex balance, the existing practice of one child per couple in urban areas and at least one son per couple in rural areas should be gradually changed to a two-children-per-couple policy throughout the country.
The government should establish a favourable social security policy for couples, especially for rural couples who have only daughters, with particular emphasis on their pension and medical care insurance, combined with an educational campaign for equal rights for boys and girls in society. 2. There is currently no routine vital-statistics registration system in rural China. Using the findings of this study, the central government could set up a routine vital-statistics registration system using family planning routine work records, which could be used by policy makers and researchers. 3. It is possible for the central and provincial government to invest more in the less developed and poor rural areas to increase the access of pregnant women in these areas to maternal care services. Central government together with local government should gradually provide free maternal care including prenatal and postnatal as well as delivery care to the women in poor and less developed rural areas. 4. Future research could be done to explore if county and the township level health care sector and the family planning system could be merged to increase the effectiveness and efficiency of maternal and child care. 5. Future research could be done to explore the relative contribution of maternal care, economic development and family planning policy on perinatal and child health using prospective cohort studies and community based randomized trials. Key words: perinatal health, perinatal mortality, stillbirth, neonatal death, sex selective abortion, sex ratio at birth, family planning, son preference, maternal care, prenatal care, postnatal care, equity, China
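
The abstract above reports relative risks (RR) with confidence intervals (CI) but does not say how the intervals were computed. As a minimal sketch, the following uses the common log-transform (Katz) approximation, which is an assumption here and not necessarily the method used in the study. The function name and any counts passed to it are hypothetical.

```python
import math


def relative_risk(events_exposed, n_exposed, events_ref, n_ref, z=1.96):
    """Relative risk with an approximate confidence interval.

    events_exposed / n_exposed -- outcome count and group size in the
                                  group of interest (e.g. second births)
    events_ref / n_ref         -- the same for the reference group
    z                          -- normal quantile; 1.96 gives a 95% interval

    Uses the log-transform approximation:
    SE(ln RR) = sqrt(1/a - 1/n1 + 1/c - 1/n2).
    """
    if events_exposed <= 0 or events_ref <= 0:
        raise ValueError("both groups need at least one event")
    if events_exposed > n_exposed or events_ref > n_ref:
        raise ValueError("event counts cannot exceed group sizes")

    risk_exposed = events_exposed / n_exposed
    risk_ref = events_ref / n_ref
    rr = risk_exposed / risk_ref

    se_log_rr = math.sqrt(
        1.0 / events_exposed - 1.0 / n_exposed
        + 1.0 / events_ref - 1.0 / n_ref
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper
```

An interval that excludes 1 indicates a difference in risk between the two groups at the chosen confidence level; the approximation becomes unreliable when event counts are very small.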

Discriminative learning of Bayesian networks via factorized conditional log-likelihood

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We propose an efficient and parameter-free scoring criterion, the factorized conditional log-likelihood (f̂CLL), for learning Bayesian network classifiers. The proposed score is an approximation of the conditional log-likelihood criterion. The approximation is devised in order to guarantee decomposability over the network structure, as well as efficient estimation of the optimal parameters, achieving the same time and space complexity as the traditional log-likelihood scoring criterion. The resulting criterion has an information-theoretic interpretation based on interaction information, which exhibits its discriminative nature. To evaluate the performance of the proposed criterion, we present an empirical comparison with state-of-the-art classifiers. Results on a large suite of benchmark data sets from the UCI repository show that f̂CLL-trained classifiers achieve at least as good accuracy as the best compared classifiers, using significantly fewer computational resources.
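
To make the distinction between the two criteria named in the abstract concrete, the sketch below computes the ordinary (joint) log-likelihood and the conditional log-likelihood for a small discrete naive Bayes classifier. It does not implement the factorized approximation itself, whose definition is not given in the abstract. The parameter layout (`prior`, `cond`) and all function names are assumptions made for this illustration.

```python
import math


def joint_prob(c, x, prior, cond):
    """P(c, x) under a naive Bayes model.

    prior[c]       -- class probability P(c)
    cond[c][i][v]  -- P(x_i = v | c) for feature i
    """
    p = prior[c]
    for i, value in enumerate(x):
        p *= cond[c][i][value]
    return p


def log_likelihood(data, prior, cond):
    """Generative score: sum over samples of log P(c, x)."""
    return sum(math.log(joint_prob(c, x, prior, cond)) for x, c in data)


def conditional_log_likelihood(data, prior, cond):
    """Discriminative score: sum over samples of log P(c | x)."""
    total = 0.0
    for x, c in data:
        # P(x) is obtained by summing the joint over all classes.
        evidence = sum(joint_prob(k, x, prior, cond) for k in prior)
        total += math.log(joint_prob(c, x, prior, cond) / evidence)
    return total
```

The normalizing sum over classes inside the logarithm couples all parameters of the model, which is why the conditional log-likelihood does not decompose over the network structure in the way the joint log-likelihood does.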