902 results for Large Data Sets


Relevance: 90.00%

Abstract:

Volatile chemical compounds responsible for the aroma of wine are derived from a number of different biochemical and chemical pathways. These compounds are formed during grape berry metabolism, crushing of the berries, fermentation (i.e. by yeast and malolactic bacteria) and the ageing and storage of wine. Not surprisingly, wine contains a large number of chemical classes of compounds, which are present at varying concentrations (ng L−1 to mg L−1), exhibit differing potencies, and have a broad range of volatilities and boiling points. The aim of this work was to investigate the potential use of near infrared (NIR) spectroscopy combined with chemometrics as a rapid and low-cost technique to measure volatile compounds in Riesling wines. Samples of commercial Riesling wine were analyzed using an NIR instrument, and volatile compounds were measured by gas chromatography (GC) coupled with selected ion monitoring mass spectrometry. Calibration models relating the NIR and GC data were developed using partial least-squares (PLS) regression with full (leave-one-out) cross validation. The coefficient of determination in cross validation (R²) and the standard error in cross validation (SECV) were 0.74 (SECV: 313.6 μg L−1) for esters, 0.90 (SECV: 20.9 μg L−1) for monoterpenes and 0.80 (SECV: 1658 μg L−1) for short-chain fatty acids. This study has shown that volatile chemical compounds present in wine can be measured by NIR spectroscopy. Further development with larger data sets will be required to test the predictive ability of the NIR calibration models developed.
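The leave-one-out statistics quoted in this abstract can be illustrated with a minimal sketch. It makes several assumptions that are not in the source: a one-predictor least-squares line stands in for the PLS model, SECV is taken as the root mean squared cross-validation residual, R² is computed as 1 − PRESS/SS_total (other conventions exist), and all data values and function names are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x (stand-in for PLS)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def leave_one_out_predictions(xs, ys):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        preds.append(intercept + slope * xs[i])
    return preds

def cross_validation_stats(ys, preds):
    """Return (R2 in cross validation, SECV) under the assumptions stated above."""
    n = len(ys)
    mean_y = sum(ys) / n
    press = sum((y - p) ** 2 for y, p in zip(ys, preds))  # prediction residual sum of squares
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    secv = math.sqrt(press / n)
    r2_cv = 1.0 - press / ss_total
    return r2_cv, secv

# Hypothetical data: one spectral feature per wine and a GC reference value (ug/L).
nir_signal = [0.11, 0.19, 0.32, 0.41, 0.48, 0.62]
gc_conc = [105.0, 190.0, 330.0, 395.0, 500.0, 610.0]

preds = leave_one_out_predictions(nir_signal, gc_conc)
r2_cv, secv = cross_validation_stats(gc_conc, preds)
print(f"R2 (CV) = {r2_cv:.3f}, SECV = {secv:.1f} ug/L")
```

In practice a multivariate PLS implementation would replace `fit_line`; the leave-one-out loop and the two summary statistics would keep the same shape.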

Relevance: 90.00%

Abstract:

Gaussian processes (GPs) are promising Bayesian methods for classification and regression problems. Designing a GP classifier and making predictions with it are, however, computationally demanding, especially when the training set is large. Sparse GP classifiers are known to overcome this limitation. In this letter, we propose and study a validation-based method for sparse GP classifier design. The proposed method uses a negative log predictive (NLP) loss measure, which is easy to compute for GP models. We use this measure for both basis vector selection and hyperparameter adaptation. The experimental results on several real-world benchmark data sets show better or comparable generalization performance compared with existing methods.
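As a rough illustration of an NLP-type loss, the sketch below averages the negative log of the predictive probability that a binary classifier assigns to the true label of each validation sample. The abstract does not give the exact definition used in the letter, so the averaging, the ±1 label coding and the function name are assumptions.

```python
import math

def nlp_loss(prob_positive, labels):
    """Average negative log predictive probability of the true labels.

    prob_positive[i] is the predictive probability that sample i has
    label +1; labels[i] is +1 or -1 (assumed coding).
    """
    total = 0.0
    for p, y in zip(prob_positive, labels):
        p_true = p if y == 1 else 1.0 - p
        total -= math.log(p_true)
    return total / len(labels)

# Hypothetical validation predictions: a confident and a hesitant classifier.
labels = [1, -1, 1, -1]
confident = [0.9, 0.1, 0.8, 0.2]
hesitant = [0.6, 0.4, 0.55, 0.45]
print(nlp_loss(confident, labels), nlp_loss(hesitant, labels))
```

A lower value means the predictive probabilities put more mass on the observed labels, which is why such a measure can be used to compare candidate basis vectors or hyperparameter settings.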

Relevance: 90.00%

Abstract:

The variability of the sea surface salinity (SSS) in the Indian Ocean is studied using a 100-year control simulation of the Community Climate System Model (CCSM 2.0). The monsoon-driven seasonal SSS pattern in the Indian Ocean, marked by low salinity in the east and high salinity in the west, is captured by the model. The model overestimates runoff into the Bay of Bengal because of higher rainfall over the Himalayan-Tibetan regions, which drain into the Bay of Bengal through the Ganga-Brahmaputra rivers. The outflow of low-salinity water from the Bay of Bengal is too strong in the model. Consequently, the model Indian Ocean SSS is about 1 less than that seen in the climatology. The seasonal Indian Ocean salt balance obtained from the model is consistent with the analysis from climatological data sets. During summer, the large freshwater input into the Bay of Bengal and its redistribution decide the spatial pattern of salinity tendency. During winter, horizontal advection is the dominant contributor to the tendency term. The interannual variability of the SSS in the Indian Ocean is about five times larger than that in coupled model simulations of the North Atlantic Ocean. Regions of large interannual standard deviations are located near river mouths in the Bay of Bengal and in the eastern equatorial Indian Ocean. Both freshwater input into the ocean and advection of this anomalous flux are responsible for the generation of these anomalies. The model simulates 20 significant Indian Ocean Dipole (IOD) events, and during IOD years large salinity anomalies appear in the equatorial Indian Ocean. The anomalies exist as two zonal bands: negative salinity anomalies to the north of the equator and positive anomalies to the south. The SSS anomalies for the years in which IOD is not present and for ENSO years are much weaker than during IOD years. Significant interannual SSS anomalies appear in the Indian Ocean only during IOD years.

Relevance: 90.00%

Abstract:

The use of near infrared (NIR) hyperspectral imaging and hyperspectral image analysis for distinguishing between hard, intermediate and soft maize kernels from inbred lines was evaluated. NIR hyperspectral images of two sets (12 and 24 kernels) of whole maize kernels were acquired using a Spectral Dimensions MatrixNIR camera with a spectral range of 960-1662 nm and a sisuChema SWIR (short wave infrared) hyperspectral pushbroom imaging system with a spectral range of 1000-2498 nm. Exploratory principal component analysis (PCA) was used on absorbance images to remove background, bad pixels and shading. On the cleaned images, PCA could be used effectively to find histological classes including glassy (hard) and floury (soft) endosperm. PCA illustrated a distinct difference between glassy and floury endosperm along principal component (PC) three on the MatrixNIR and PC two on the sisuChema, with two distinguishable clusters. Subsequently, partial least squares discriminant analysis (PLS-DA) was applied to build a classification model. The PLS-DA model from the MatrixNIR image (12 kernels) resulted in a root mean square error of prediction (RMSEP) of 0.18. This was repeated on the MatrixNIR image of the 24 kernels, which also resulted in an RMSEP of 0.18. The sisuChema image yielded an RMSEP of 0.29. The reproducible results obtained with the different data sets indicate that the method proposed in this paper has real potential for future classification uses.
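To make the RMSEP figures concrete, the sketch below computes the root mean square error of prediction for a discriminant model whose two classes are dummy-coded as 0 and 1, and then assigns classes by thresholding. The 0/1 coding, the 0.5 threshold, the function names and all values are assumptions for illustration; the abstract does not describe how the study coded or thresholded its classes.

```python
import math

def rmsep(predicted, reference):
    """Root mean square error of prediction over a test set."""
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)

def assign_class(predicted, threshold=0.5):
    """Map continuous PLS-DA style outputs to class 1 (>= threshold) or 0."""
    return [1 if p >= threshold else 0 for p in predicted]

# Hypothetical pixels: 1 = glassy endosperm, 0 = floury endosperm.
reference = [1, 1, 1, 0, 0, 0]
predicted = [0.9, 0.8, 1.1, 0.2, -0.1, 0.3]

print(rmsep(predicted, reference))
print(assign_class(predicted))
```

With dummy-coded classes, an RMSEP well below the 0.5 decision boundary suggests that most predictions fall on the correct side of the threshold.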

Relevance: 90.00%

Abstract:

This project was designed to provide the structural softwood processing industry with the basis for improved green and dry grading, in order to maximise MGP grade yields, deliver consistent product performance and reduce processing costs. To achieve this, advanced statistical techniques were used in conjunction with state-of-the-art property measurement systems. Specifically, the project aimed to make two significant steps forward for the Australian structural softwood industry:
• assessment of technologies, both existing and novel, that may lead to selection of a consistent, reliable and accurate device for the log yard and green mill. The purpose is to more accurately identify and reject material that will not make a minimum grade of MGP10 downstream;
• improved correlation of grading MOE and MOR parameters in the dry mill using new analytical methods and a combination of devices.
The three populations tested were stiffness-limited radiata pine, strength-limited radiata pine and Caribbean pine. Resonance tests were conducted on logs prior to sawmilling, and on boards. Raw data from existing in-line systems were captured for the green and dry boards. The dataset was analysed using classical and advanced statistical tools to provide correlations between data sets and to develop efficient strength and stiffness prediction equations. Stiffness and strength prediction algorithms were developed from raw and combined parameters. Prediction capabilities were compared using in-line parameters, off-line parameters and a combination of in-line and off-line parameters. The results show that acoustic resonance techniques have potential for log assessment, to sort for low stiffness and/or low strength, depending on the resource. From the log measurements, a strong correlation was found between the average static MOE of the dried boards within a log and the predicted value.
These results have application in segregating logs into structural and non-structural uses. Some commercial technologies are already available for this application, such as the Hitman LG640. For green boards it was found that in-line and laboratory acoustic devices can provide a good prediction of dry static MOE and a moderate prediction of MOR. There is high potential for segregating boards at this stage of processing. Grading after the log breakdown can significantly improve the effectiveness of the mill. Subsequently, reductions in non-structural volumes can be achieved. Depending on the resource, a 5 to 8 % reduction in non-structural boards can be expected; these boards will not be dried, with an associated saving of $70 to 85/m3. For dry boards, vibration and a standard Metriguard CLT/HCLT provided a similar level of prediction on stiffness-limited resource. However, Metriguard provides a better strength prediction in strength-limited resources (due to this equipment’s ability to measure local characteristics). The combination of grading equipment specifically for stiffness-related predictors (Metriguard or vibration) with defect detection systems (optical or X-ray scanner) provides a higher level of prediction, especially for MOR. Several commercial technologies are already available for acoustic grading of boards, such as those from Microtec, Luxscan, Falcon Engineering or Dynalyse AB. Differing combinations of equipment, and their strategic location within the processing chain, can dramatically improve the efficiency of the mill, to a degree that will vary depending on the resource. For example, an initial acoustic sorting of green boards, combined with an optical scanner associated with an acoustic system for grading dry boards, can result in a large reduction in the proportion of low-value non-structural boards produced. The application of classical MLR on several predictors proved to be effective, in particular for MOR predictions.
However, the use of a modern statistical approach (chemometrics tools) such as PLS proved to be more efficient for improving the level of prediction. Compared to existing technologies, the results of the project indicate good potential for improved grading in the green mill, ahead of kiln drying and subsequent cost-adding processes. The next stage is the development and refinement of systems for this purpose.

Relevance: 90.00%

Abstract:

Obesity and overweight have become more common in recent decades; more than half of the population of Western countries is already overweight and one fifth is obese. The increase in overweight has been especially rapid among young people. Overweight, especially when combined with abdominal obesity, and smoking increase morbidity from cardiovascular diseases, metabolic diseases such as diabetes, and many cancers. Obesity and smoking are thus among the most important preventable causes of death in developed countries. Alongside overweight, dieting and even weight-loss methods that are harmful to health, such as smoking as a means of weight control, have become increasingly common. Dieting aimed at rapid weight loss often has harmful health consequences, such as weight gain beyond the original weight and a shift of body fat distribution toward a less healthy pattern. Three quarters of those who have lost a significant amount of weight report that the weight has returned. The effects of smoking and repeated dieting on the development of overweight and obesity are interlinked. This doctoral thesis examined the effect of repeated dieting and smoking on body weight and, in addition, the effect of smoking on the development of abdominal obesity. A second aim of the work was to examine how strongly smoking and repeated dieting are associated with each other among Finns, and whether this association differs between age groups and sexes. The work is based on three large questionnaire data sets. In the FinnTwin16 study (Nuorten Kaksosten Terveystutkimus), twins born in 1975-79 were followed at the ages of 16, 17, 18 and 24 years (N=5563). The data of the Finnish Twin Cohort (N=12 793) were collected in 1990 from same-sex twins born in 1930-57. In the follow-up study of former elite athletes (N=1838) and their matched controls (N=834), data were collected in 1985, 1995 and 2001. Height, weight and smoking were asked about in all questionnaires.
The twins answered questions on dieting behaviour. The athletes' dieting behaviour was inferred from their sport, since repeated dieting is known to be common among athletes competing in weight classes (e.g. wrestlers, boxers). Smoking in adolescence predicted abdominal obesity in both sexes and, in addition, overweight in women. Repeated dieting was associated with later weight gain and obesity in men. In addition, repeated dieting and smoking were found to be associated with each other in young adults. In older age groups, men who smoked dieted less often than non-smokers. In preventing the comorbidity associated with obesity and abdominal obesity, reducing smoking and repeated dieting may be more effective than previously thought.

Relevance: 90.00%

Abstract:

Previous microarray analyses identified 22 microRNAs (miRNAs) differentially expressed in paired ectopic and eutopic endometrium of women with and without endometriosis. To investigate further the role of these miRNAs in women with endometriosis, we conducted an association study to explore the relationship between endometriosis risk and single-nucleotide polymorphisms (SNPs) in miRNA target sites for these differentially expressed miRNAs. A panel of 102 SNPs in the predicted miRNA binding sites was evaluated in an endometriosis association study, and an ingenuity pathway analysis was performed. Fourteen rare variants were identified in this study. We found that SNP rs14647 in the Wolf-Hirschhorn syndrome candidate gene 1 (WHSC1) 3'UTR (untranslated region) was associated with endometriosis-related infertility, with an odds ratio of 12.2 (95% confidence interval = 2.4-60.7, P = 9.03 × 10⁻⁵). SNP haplotype AGG in the solute carrier family 22, member 23 (SLC22A23) 3'UTR was associated with endometriosis-related infertility and more severe disease. With the individual genotyping data, ingenuity pathway analysis identified the tumour necrosis factor and cyclin-dependent kinase inhibitor as major factors in the molecular pathways. Significant associations were identified between WHSC1 alleles and endometriosis-related infertility, and between SLC22A23 haplotypes and severe disease stage. These findings may help focus future research on subphenotypes of this disease. Replication studies in large independent sample sets are needed to confirm and characterize the involvement of the gene variation in the pathogenesis of endometriosis.
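An odds ratio with a 95% confidence interval, as reported for rs14647, can be sketched from a 2×2 table of carrier counts. The abstract does not say how the interval was obtained, so the log-scale normal approximation (Woolf method) used here is an assumption, and the counts and function name are hypothetical.

```python
import math

def odds_ratio_ci(case_carrier, case_noncarrier, control_carrier, control_noncarrier, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table (Woolf method)."""
    odds_ratio = (case_carrier * control_noncarrier) / (case_noncarrier * control_carrier)
    # Standard error of log(OR) under the normal approximation.
    se_log = math.sqrt(1 / case_carrier + 1 / case_noncarrier
                       + 1 / control_carrier + 1 / control_noncarrier)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts, not taken from the study.
or_, lo, hi = odds_ratio_ci(12, 88, 3, 197)
print(f"OR = {or_:.1f}, 95% CI = {lo:.1f}-{hi:.1f}")
```

Small cell counts produce wide intervals on the odds-ratio scale, which is one reason replication in larger sample sets matters for rare variants.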

Relevance: 90.00%

Abstract:

Means of breeding for the milk coagulation ability of dairy cows. The thesis examined how the cheese-making quality of dairy cow milk could be improved through selective breeding. The topic is important because an ever larger share of milk is used for cheese-making. The study focused on milk coagulation ability, as it is one of the key factors affecting cheese yield. Milk coagulation ability varied considerably between cows, sires, herds, breeds and stages of lactation. Although there were large differences between herds in the coagulation ability of bulk tank milk, herd explained only a small part of the total variation in coagulation ability. Genetic differences between cows probably explain most of the differences observed in the coagulation ability of herd bulk tank milk. Good management and feeding nevertheless reduced to some extent the proportion of poorly coagulating bulk tank milk in herds. Holstein-Friesian cows had better coagulation ability than Ayrshire cows. Poor coagulation and noncoagulation were only a minor problem in Holstein-Friesians (10 %), whereas a third of Ayrshire cows produced poorly coagulating or noncoagulating milk. Milk is called poorly coagulating when the curd is not firm enough to be cut half an hour after the addition of rennet. Milk defined as noncoagulating does not coagulate at all within half an hour and is therefore a very poor raw material for cheese dairies. About 40 % of the differences between cows in milk coagulation ability were explained by genetic factors. Coagulation ability can thus be called a highly heritable trait. Three measurements per cow are quite sufficient for estimating the average coagulation ability of a cow's milk. At present, however, direct breeding for coagulation ability is hampered by the lack of an automated measuring device suitable for large-scale use.
For this reason, the thesis examined the possibilities of breeding for milk coagulation ability indirectly, through some other trait. Such a trait must be genetically linked strongly enough to coagulation ability for breeding through it to be possible. The traits studied were milk yield and udder health traits, which are already included in the total merit index of sires, and milk protein content, casein content and pH, which are not part of the total merit index. The thesis also examined the possibilities for so-called marker-assisted selection by studying the inheritance of milk noncoagulation and mapping the chromosomal regions associated with it. According to the results, breeding for udder health also improves milk coagulation ability somewhat and reduces noncoagulation in Ayrshire cows. Milk yield, by contrast, is genetically independent of both milk coagulation ability and noncoagulation. The genetic association of milk protein and casein content with coagulation ability was also approximately zero. There was a fairly strong genetic association between milk pH and coagulation ability, so breeding for milk pH would also improve milk coagulation ability. It would probably not, however, reduce the number of cows producing noncoagulating milk. Because milk noncoagulation is such a common problem in Finnish Ayrshire cows, the thesis examined the background of the phenomenon in more detail. In all three data sets, about 10 % of Ayrshire cows produced noncoagulating milk. During two years of monthly monitoring, some cows produced noncoagulating milk at almost every measurement. Milk noncoagulation was associated with stage of lactation, but none of the environmental factors could fully explain it. Instead, the indications of its heritability strengthened as the studies progressed.
Finally, the research group succeeded in mapping the chromosomal regions causing noncoagulation to chromosomes 2 and 18, close to the DNA markers BMS1126 and BMS1355. According to the results, milk noncoagulation is not associated with the casein genes that are central to the milk coagulation process. Instead, it is possible that the noncoagulation problem is caused by errors in the processing that follows synthesis of the casein gene products. The matter requires thorough investigation, however. According to the results of the thesis, culling animals that carry the milk noncoagulation gene from among breeding animals would be the most effective way to breed for milk coagulation ability in the Finnish dairy cattle population.

Relevance: 90.00%

Abstract:

A study was performed to investigate the value of near infrared reflectance spectroscopy (NIRS) as an alternative to analytical techniques for identifying QTL associated with feed quality traits. Milled samples from an F6-derived recombinant inbred Tallon/Scarlett population were incubated in the rumen of fistulated cattle, recovered, washed and dried to determine the in-situ dry matter digestibility (DMD). Both pre- and post-digestion samples were analysed using NIRS to quantify key quality components relating to acid detergent fibre, starch and protein. These phenotypic data were used to identify trait-associated QTL and compare them to previously identified QTL. Though a number of genetic correlations were identified between the phenotypic data sets, the correlation of most interest was between DMD and starch digested (r = -0.382). The significance of this genetic correlation was that the NIRS data set identified a putative QTL on chromosome 7H (LOD = 3.3) associated with starch digested. A QTL for DMD occurred in the same region of chromosome 7H, with flanking markers fAG/CAT63 and bPb-0758. The significant correlation and the identification of this putative QTL highlight the potential of technologies like NIRS in QTL analysis.

Relevance: 90.00%

Abstract:

This research studied distributed computing of all-to-all comparison problems with big data sets. The thesis formalised the problem and developed a high-performance, scalable computing framework with a programming model, data distribution strategies and task scheduling policies to solve it. The study considered storage usage, data locality and load balancing to improve performance. The research outcomes can be applied in bioinformatics, biometrics, data mining and other domains in which all-to-all comparisons are a typical computing pattern.
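The all-to-all comparison pattern and the trade-off between load balancing and storage usage can be sketched in a few lines. This is not the thesis's framework: the round-robin assignment and the function names below are assumptions chosen only to show why both the task count per worker and the data each worker must hold are worth tracking.

```python
from itertools import combinations

def all_pairs(items):
    """Every unordered pair of items: n * (n - 1) / 2 comparison tasks."""
    return list(combinations(items, 2))

def assign_round_robin(pairs, n_workers):
    """Spread comparison tasks evenly over workers (simple load balancing)."""
    tasks = [[] for _ in range(n_workers)]
    for k, pair in enumerate(pairs):
        tasks[k % n_workers].append(pair)
    return tasks

def items_needed(tasks):
    """Data items each worker must store locally to run its tasks."""
    return [sorted({item for pair in worker for item in pair}) for worker in tasks]

# Hypothetical data files to be compared against each other.
files = ["A", "B", "C", "D", "E"]
pairs = all_pairs(files)
tasks = assign_round_robin(pairs, 3)

for worker, (work, data) in enumerate(zip(tasks, items_needed(tasks))):
    print(worker, len(work), data)
```

Round-robin assignment balances the number of tasks but tends to make every worker store almost every item, which illustrates why a dedicated data distribution strategy may be needed when the items are large.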

Relevance: 90.00%

Abstract:

The treatment of large segmental bone defects remains a significant clinical challenge. Due to limitations surrounding the use of bone grafts, tissue-engineered constructs for the repair of large bone defects could offer an alternative. Before translation of any newly developed tissue engineering (TE) approach to the clinic, efficacy of the treatment must be shown in a validated preclinical large animal model. Currently, biomechanical testing, histology, and microcomputed tomography are performed to assess the quality and quantity of the regenerated bone. However, in vivo monitoring of the progression of healing, which could reveal important information regarding time to restoration of mechanical function and acceleration of regeneration, is seldom performed. Furthermore, since the mechanical environment is known to influence bone regeneration, and limb loading of the animals can be controlled only poorly, characterizing activity and load history could help explain variability in the acquired data sets and potentially identify outliers caused by abnormal loading. Many approaches have been devised to monitor the progression of healing and characterize the mechanical environment in fracture healing studies. In this article, we review previous methods and share results of recent work of our group toward developing and implementing a comprehensive biomechanical monitoring system to study bone regeneration in preclinical TE studies.

Relevance: 90.00%

Abstract:

Remote sensing provides methods to infer land cover information over large geographical areas at a variety of spatial and temporal resolutions. Land cover is input data for a range of environmental models, and information on land cover dynamics is required for monitoring the implications of global change. Such data are also essential in support of environmental management and policymaking. Boreal forests are a key component of the global climate and a major sink of carbon. The northern latitudes are expected to experience a disproportionate and rapid warming, which can have a major impact on vegetation at forest limits. This thesis examines the use of optical remote sensing for estimating aboveground biomass, leaf area index (LAI), tree cover and tree height in the boreal forests and tundra-taiga transition zone in Finland. Continuous fields of forest attributes are required, for example, to improve the mapping of forest extent. The thesis focuses on studying the feasibility of satellite data at multiple spatial resolutions, assesses the potential of multispectral, multiangular and multitemporal information, and provides a regional evaluation of global land cover data. Preprocessed ASTER, MISR and MODIS products are the principal satellite data. The reference data consist of field measurements, forest inventory data and fine resolution land cover maps. Fine resolution studies demonstrate that statistical relationships between biomass and satellite data are relatively strong in single-species, low-biomass mountain birch forests in comparison to higher-biomass coniferous stands. The combination of forest stand data and fine resolution ASTER images provides a method for biomass estimation using medium resolution MODIS data. The multiangular data improve the accuracy of land cover mapping in the sparsely forested tundra-taiga transition zone, particularly in mires.
Similarly, multitemporal data improve the accuracy of coarse resolution tree cover estimates in comparison to single date data. Furthermore, the peak of the growing season is not necessarily the optimal time for land cover mapping in the northern boreal regions. The evaluated coarse resolution land cover data sets have considerable shortcomings in northernmost Finland and should be used with caution in similar regions. The quantitative reference data and upscaling methods for integrating multiresolution data are required for calibration of statistical models and evaluation of land cover data sets. The preprocessed image products have potential for wider use as they can considerably reduce the time and effort used for data processing.

Relevance: 90.00%

Abstract:

Bacteria play an important role in many ecological systems. The molecular characterization of bacteria using either cultivation-dependent or cultivation-independent methods reveals the large scale of bacterial diversity in natural communities, and the vastness of subpopulations within a species or genus. Understanding how bacterial diversity varies across different environments and also within populations should provide insights into many important questions of bacterial evolution and population dynamics. This thesis presents novel statistical methods for analyzing bacterial diversity using widely employed molecular fingerprinting techniques. The first objective of this thesis was to develop Bayesian clustering models to identify bacterial population structures. Bacterial isolates were identified using multilocus sequence typing (MLST), and Bayesian clustering models were used to explore the evolutionary relationships among isolates. Our method involves the inference of genetic population structures via an unsupervised clustering framework where the dependence between loci is represented using graphical models. The population dynamics that generate such a population stratification were investigated using a stochastic model, in which homologous recombination between subpopulations can be quantified within a gene flow network. The second part of the thesis focuses on cluster analysis of community compositional data produced by two different cultivation-independent analyses: terminal restriction fragment length polymorphism (T-RFLP) analysis, and fatty acid methyl ester (FAME) analysis. The cluster analysis aims to group bacterial communities that are similar in composition, which is an important step for understanding the overall influences of environmental and ecological perturbations on bacterial diversity.
A common feature of T-RFLP and FAME data is zero-inflation, which means that zero values are observed much more frequently than would be expected, for example, from a Poisson distribution in the discrete case, or a Gaussian distribution in the continuous case. We provided two strategies for modeling zero-inflation in the clustering framework, which were validated on both synthetic and empirical complex data sets. We show in the thesis that our model, which takes into account dependencies between loci in MLST data, can produce better clustering results than methods that assume independent loci. Furthermore, computer algorithms that are efficient in analyzing large-scale data were adopted to meet the increasing computational need. Our method for detecting homologous recombination in subpopulations may provide a theoretical criterion for defining bacterial species. The clustering of bacterial community data, including T-RFLP and FAME data, provides an initial effort toward discovering the evolutionary dynamics that structure and maintain bacterial diversity in the natural environment.
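Zero-inflation in the discrete case can be illustrated with the textbook zero-inflated Poisson mixture, in which an extra probability mass is placed on zero. The abstract does not describe the two modelling strategies used in the thesis, so this is only a generic sketch; the parameter names `lam` (Poisson mean) and `pi_zero` (probability of a structural zero) are assumptions.

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing count k under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def zip_pmf(k, lam, pi_zero):
    """Zero-inflated Poisson: with probability pi_zero the count is a
    structural zero, otherwise it is drawn from Poisson(lam)."""
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    return pi_zero + base if k == 0 else base

# Hypothetical parameters: how much more likely a zero becomes.
lam, pi_zero = 3.0, 0.4
print(poisson_pmf(0, lam), zip_pmf(0, lam, pi_zero))
```

Comparing the two probabilities at zero shows how a plain Poisson model would understate the number of absent fragments or fatty acids in data of this kind.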

Relevance: 90.00%

Abstract:

Big Data and Learning Analytics’ promise to revolutionise educational institutions, endeavours, and actions through more and better data is now compelling. Multiple, and continually updating, data sets produce a new sense of ‘personalised learning’. A crucial attribute of the datafication, and subsequent profiling, of learner behaviour and engagement is the continual modification of the learning environment to induce greater levels of investment on the part of each learner. The assumption is that more and better data, gathered faster and fed into ever-updating algorithms, provide more complete tools to understand, and therefore improve, learning experiences through adaptive personalisation. The argument in this paper is that Learning Personalisation names a new logistics of investment as the common ‘sense’ of the school, in which disciplinary education is ‘both disappearing and giving way to frightful continual training, to continual monitoring’.

Relevance: 90.00%

Abstract:

Background: The Global Burden of Diseases (GBD), Injuries, and Risk Factors study used the disability-adjusted life year (DALY) to quantify the burden of diseases, injuries, and risk factors. This paper provides an overview of injury estimates from the 2013 update of GBD, with detailed information on incidence, mortality, DALYs and rates of change from 1990 to 2013 for 26 causes of injury, globally, by region and by country. Methods: Injury mortality was estimated using the extensive GBD mortality database, corrections for ill-defined cause of death and the cause of death ensemble modelling tool. Morbidity estimation was based on inpatient and outpatient data sets, 26 cause-of-injury and 47 nature-of-injury categories, and seven follow-up studies with patient-reported long-term outcome measures. Results: In 2013, 973 million (uncertainty interval (UI) 942 to 993) people sustained injuries that warranted some type of healthcare and 4.8 million (UI 4.5 to 5.1) people died from injuries. Between 1990 and 2013 the global age-standardised injury DALY rate decreased by 31% (UI 26% to 35%). The rate of decline in DALY rates was significant for 22 cause-of-injury categories, including all the major injuries. Conclusions: Injuries continue to be an important cause of morbidity and mortality in the developed and developing world. The decline in rates for almost all injuries is so prominent that it warrants a general statement that the world is becoming a safer place to live in. However, the patterns vary widely by cause, age, sex, region and time, and there are still large improvements that need to be made.
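The arithmetic behind a DALY rate and its relative change can be sketched briefly. The sketch relies on the standard definition of a DALY as years of life lost (YLL) plus years lived with disability (YLD), which the abstract itself does not spell out; the function names and all numbers are hypothetical and are not GBD estimates.

```python
def dalys(yll, yld):
    """Disability-adjusted life years: years of life lost plus years lived with disability."""
    return yll + yld

def percent_change(rate_start, rate_end):
    """Relative change between two rates, in percent (negative means a decline)."""
    return 100.0 * (rate_end - rate_start) / rate_start

# Hypothetical age-standardised rates per 100,000 population.
rate_1990 = dalys(yll=4000.0, yld=1000.0)
rate_2013 = dalys(yll=2700.0, yld=800.0)
print(percent_change(rate_1990, rate_2013))
```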