913 resultados para count data models


Relevância:

30.00% 30.00%

Publicador:

Resumo:

The treatment of large segmental bone defects remains a significant clinical challenge. Due to limitations surrounding the use of bone grafts, tissue-engineered constructs for the repair of large bone defects could offer an alternative. Before translation of any newly developed tissue engineering (TE) approach to the clinic, efficacy of the treatment must be shown in a validated preclinical large animal model. Currently, biomechanical testing, histology, and microcomputed tomography are performed to assess the quality and quantity of the regenerated bone. However, in vivo monitoring of the progression of healing is seldom performed, which could reveal important information regarding time to restoration of mechanical function and acceleration of regeneration. Furthermore, since the mechanical environment is known to influence bone regeneration, and limb loading of the animals can poorly be controlled, characterizing activity and load history could provide the ability to explain variability in the acquired data sets and potentially outliers based on abnormal loading. Many approaches have been devised to monitor the progression of healing and characterize the mechanical environment in fracture healing studies. In this article, we review previous methods and share results of recent work of our group toward developing and implementing a comprehensive biomechanical monitoring system to study bone regeneration in preclinical TE studies.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The behaviour of laterally loaded piles is considerably influenced by the uncertainties in soil properties. Hence probabilistic models for assessment of allowable lateral load are necessary. Cone penetration test (CPT) data are often used to determine soil strength parameters, whereby the allowable lateral load of the pile is computed. In the present study, the maximum lateral displacement and moment of the pile are obtained based on the coefficient of subgrade reaction approach, considering the nonlinear soil behaviour in undrained clay. The coefficient of subgrade reaction is related to the undrained shear strength of soil, which can be obtained from CPT data. The soil medium is modelled as a one-dimensional random field along the depth, and it is described by the standard deviation and scale of fluctuation of the undrained shear strength of soil. Inherent soil variability, measurement uncertainty and transformation uncertainty are taken into consideration. The statistics of maximum lateral deflection and moment are obtained using the first-order, second-moment technique. Hasofer-Lind reliability indices for component and system failure criteria, based on the allowable lateral displacement and moment capacity of the pile section, are evaluated. The geotechnical database from the Konaseema site in India is used as a case example. It is shown that the reliability-based design approach for pile foundations, considering the spatial variability of soil, permits a rational choice of allowable lateral loads.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Whether a statistician wants to complement a probability model for observed data with a prior distribution and carry out fully probabilistic inference, or base the inference only on the likelihood function, may be a fundamental question in theory, but in practice it may well be of less importance if the likelihood contains much more information than the prior. Maximum likelihood inference can be justified as a Gaussian approximation at the posterior mode, using flat priors. However, in situations where parametric assumptions in standard statistical models would be too rigid, more flexible model formulation, combined with fully probabilistic inference, can be achieved using hierarchical Bayesian parametrization. This work includes five articles, all of which apply probability modeling under various problems involving incomplete observation. Three of the papers apply maximum likelihood estimation and two of them hierarchical Bayesian modeling. Because maximum likelihood may be presented as a special case of Bayesian inference, but not the other way round, in the introductory part of this work we present a framework for probability-based inference using only Bayesian concepts. We also re-derive some results presented in the original articles using the toolbox equipped herein, to show that they are also justifiable under this more general framework. Here the assumption of exchangeability and de Finetti's representation theorem are applied repeatedly for justifying the use of standard parametric probability models with conditionally independent likelihood contributions. It is argued that this same reasoning can be applied also under sampling from a finite population. The main emphasis here is in probability-based inference under incomplete observation due to study design. This is illustrated using a generic two-phase cohort sampling design as an example. The alternative approaches presented for analysis of such a design are full likelihood, which utilizes all observed information, and conditional likelihood, which is restricted to a completely observed set, conditioning on the rule that generated that set. Conditional likelihood inference is also applied for a joint analysis of prevalence and incidence data, a situation subject to both left censoring and left truncation. Other topics covered are model uncertainty and causal inference using posterior predictive distributions. We formulate a non-parametric monotonic regression model for one or more covariates and a Bayesian estimation procedure, and apply the model in the context of optimal sequential treatment regimes, demonstrating that inference based on posterior predictive distributions is feasible also in this case.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Advancements in the analysis techniques have led to a rapid accumulation of biological data in databases. Such data often are in the form of sequences of observations, examples including DNA sequences and amino acid sequences of proteins. The scale and quality of the data give promises of answering various biologically relevant questions in more detail than what has been possible before. For example, one may wish to identify areas in an amino acid sequence, which are important for the function of the corresponding protein, or investigate how characteristics on the level of DNA sequence affect the adaptation of a bacterial species to its environment. Many of the interesting questions are intimately associated with the understanding of the evolutionary relationships among the items under consideration. The aim of this work is to develop novel statistical models and computational techniques to meet with the challenge of deriving meaning from the increasing amounts of data. Our main concern is on modeling the evolutionary relationships based on the observed molecular data. We operate within a Bayesian statistical framework, which allows a probabilistic quantification of the uncertainties related to a particular solution. As the basis of our modeling approach we utilize a partition model, which is used to describe the structure of data by appropriately dividing the data items into clusters of related items. Generalizations and modifications of the partition model are developed and applied to various problems. Large-scale data sets provide also a computational challenge. The models used to describe the data must be realistic enough to capture the essential features of the current modeling task but, at the same time, simple enough to make it possible to carry out the inference in practice. The partition model fulfills these two requirements. The problem-specific features can be taken into account by modifying the prior probability distributions of the model parameters. The computational efficiency stems from the ability to integrate out the parameters of the partition model analytically, which enables the use of efficient stochastic search algorithms.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Genetics, the science of heredity and variation in living organisms, has a central role in medicine, in breeding crops and livestock, and in studying fundamental topics of biological sciences such as evolution and cell functioning. Currently the field of genetics is under a rapid development because of the recent advances in technologies by which molecular data can be obtained from living organisms. In order that most information from such data can be extracted, the analyses need to be carried out using statistical models that are tailored to take account of the particular genetic processes. In this thesis we formulate and analyze Bayesian models for genetic marker data of contemporary individuals. The major focus is on the modeling of the unobserved recent ancestry of the sampled individuals (say, for tens of generations or so), which is carried out by using explicit probabilistic reconstructions of the pedigree structures accompanied by the gene flows at the marker loci. For such a recent history, the recombination process is the major genetic force that shapes the genomes of the individuals, and it is included in the model by assuming that the recombination fractions between the adjacent markers are known. The posterior distribution of the unobserved history of the individuals is studied conditionally on the observed marker data by using a Markov chain Monte Carlo algorithm (MCMC). The example analyses consider estimation of the population structure, relatedness structure (both at the level of whole genomes as well as at each marker separately), and haplotype configurations. For situations where the pedigree structure is partially known, an algorithm to create an initial state for the MCMC algorithm is given. Furthermore, the thesis includes an extension of the model for the recent genetic history to situations where also a quantitative phenotype has been measured from the contemporary individuals. In that case the goal is to identify positions on the genome that affect the observed phenotypic values. This task is carried out within the Bayesian framework, where the number and the relative effects of the quantitative trait loci are treated as random variables whose posterior distribution is studied conditionally on the observed genetic and phenotypic data. In addition, the thesis contains an extension of a widely-used haplotyping method, the PHASE algorithm, to settings where genetic material from several individuals has been pooled together, and the allele frequencies of each pool are determined in a single genotyping.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This work develops methods to account for shoot structure in models of coniferous canopy radiative transfer. Shoot structure, as it varies along the light gradient inside canopy, affects the efficiency of light interception per unit needle area, foliage biomass, or foliage nitrogen. The clumping of needles in the shoot volume also causes a notable amount of multiple scattering of light within coniferous shoots. The effect of shoot structure on light interception is treated in the context of canopy level photosynthesis and resource use models, and the phenomenon of within-shoot multiple scattering in the context of physical canopy reflectance models for remote sensing purposes. Light interception. A method for estimating the amount of PAR (Photosynthetically Active Radiation) intercepted by a conifer shoot is presented. The method combines modelling of the directional distribution of radiation above canopy, fish-eye photographs taken at shoot locations to measure canopy gap fraction, and geometrical measurements of shoot orientation and structure. Data on light availability, shoot and needle structure and nitrogen content has been collected from canopies of Pacific silver fir (Abies amabilis (Dougl.) Forbes) and Norway spruce (Picea abies (L.) Karst.). Shoot structure acclimated to light gradient inside canopy so that more shaded shoots have better light interception efficiency. Light interception efficiency of shoots varied about two-fold per needle area, about four-fold per needle dry mass, and about five-fold per nitrogen content. Comparison of fertilized and control stands of Norway spruce indicated that light interception efficiency is not greatly affected by fertilization. Light scattering. Structure of coniferous shoots gives rise to multiple scattering of light between the needles of the shoot. Using geometric models of shoots, multiple scattering was studied by photon tracing simulations. Based on simulation results, the dependence of the scattering coefficient of shoot from the scattering coefficient of needles is shown to follow a simple one-parameter model. The single parameter, termed the recollision probability, describes the level of clumping of the needles in the shoot, is wavelength independent, and can be connected to previously used clumping indices. By using the recollision probability to correct for the within-shoot multiple scattering, canopy radiative transfer models which have used leaves as basic elements can use shoots as basic elements, and thus be applied for coniferous forests. Preliminary testing of this approach seems to explain, at least partially, why coniferous forests appear darker than broadleaved forests in satellite data.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Disease maps are effective tools for explaining and predicting patterns of disease outcomes across geographical space, identifying areas of potentially elevated risk, and formulating and validating aetiological hypotheses for a disease. Bayesian models have become a standard approach to disease mapping in recent decades. This article aims to provide a basic understanding of the key concepts involved in Bayesian disease mapping methods for areal data. It is anticipated that this will help in interpretation of published maps, and provide a useful starting point for anyone interested in running disease mapping methods for areal data. The article provides detailed motivation and descriptions on disease mapping methods by explaining the concepts, defining the technical terms, and illustrating the utility of disease mapping for epidemiological research by demonstrating various ways of visualising model outputs using a case study. The target audience includes spatial scientists in health and other fields, policy or decision makers, health geographers, spatial analysts, public health professionals, and epidemiologists.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Fisheries management agencies around the world collect age data for the purpose of assessing the status of natural resources in their jurisdiction. Estimates of mortality rates represent a key information to assess the sustainability of fish stocks exploitation. Contrary to medical research or manufacturing where survival analysis is routinely applied to estimate failure rates, survival analysis has seldom been applied in fisheries stock assessment despite similar purposes between these fields of applied statistics. In this paper, we developed hazard functions to model the dynamic of an exploited fish population. These functions were used to estimate all parameters necessary for stock assessment (including natural and fishing mortality rates as well as gear selectivity) by maximum likelihood using age data from a sample of catch. This novel application of survival analysis to fisheries stock assessment was tested by Monte Carlo simulations to assert that it provided unbiased estimations of relevant quantities. The method was applied to the data from the Queensland (Australia) sea mullet (Mugil cephalus) commercial fishery collected between 2007 and 2014. It provided, for the first time, an estimate of natural mortality affecting this stock: 0.22±0.08 year −1 .

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Large-scale chromosome rearrangements such as copy number variants (CNVs) and inversions encompass a considerable proportion of the genetic variation between human individuals. In a number of cases, they have been closely linked with various inheritable diseases. Single-nucleotide polymorphisms (SNPs) are another large part of the genetic variance between individuals. They are also typically abundant and their measuring is straightforward and cheap. This thesis presents computational means of using SNPs to detect the presence of inversions and deletions, a particular variety of CNVs. Technically, the inversion-detection algorithm detects the suppressed recombination rate between inverted and non-inverted haplotype populations whereas the deletion-detection algorithm uses the EM-algorithm to estimate the haplotype frequencies of a window with and without a deletion haplotype. As a contribution to population biology, a coalescent simulator for simulating inversion polymorphisms has been developed. Coalescent simulation is a backward-in-time method of modelling population ancestry. Technically, the simulator also models multiple crossovers by using the Counting model as the chiasma interference model. Finally, this thesis includes an experimental section. The aforementioned methods were tested on synthetic data to evaluate their power and specificity. They were also applied to the HapMap Phase II and Phase III data sets, yielding a number of candidates for previously unknown inversions, deletions and also correctly detecting known such rearrangements.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this Thesis, we develop theory and methods for computational data analysis. The problems in data analysis are approached from three perspectives: statistical learning theory, the Bayesian framework, and the information-theoretic minimum description length (MDL) principle. Contributions in statistical learning theory address the possibility of generalization to unseen cases, and regression analysis with partially observed data with an application to mobile device positioning. In the second part of the Thesis, we discuss so called Bayesian network classifiers, and show that they are closely related to logistic regression models. In the final part, we apply the MDL principle to tracing the history of old manuscripts, and to noise reduction in digital signals.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A central tenet in the theory of reliability modelling is the quantification of the probability of asset failure. In general, reliability depends on asset age and the maintenance policy applied. Usually, failure and maintenance times are the primary inputs to reliability models. However, for many organisations, different aspects of these data are often recorded in different databases (e.g. work order notifications, event logs, condition monitoring data, and process control data). These recorded data cannot be interpreted individually, since they typically do not have all the information necessary to ascertain failure and preventive maintenance times. This paper presents a methodology for the extraction of failure and preventive maintenance times using commonly-available, real-world data sources. A text-mining approach is employed to extract keywords indicative of the source of the maintenance event. Using these keywords, a Naïve Bayes classifier is then applied to attribute each machine stoppage to one of two classes: failure or preventive. The accuracy of the algorithm is assessed and the classified failure time data are then presented. The applicability of the methodology is demonstrated on a maintenance data set from an Australian electricity company.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

It is common to model the dynamics of fisheries using natural and fishing mortality rates estimated independently using two separate analyses. Fishing mortality is routinely estimated from widely available logbook data, whereas natural mortality estimations have often required more specific, less frequently available, data. However, in the case of the fishery for brown tiger prawn (Penaeus esculentus) in Moreton Bay, both fishing and natural mortality rates have been estimated from logbook data. The present work extended the fishing mortality model to incorporate an eco-physiological response of tiger prawn to temperature, and allowed recruitment timing to vary from year to year. These ecological characteristics of the dynamics of this fishery were ignored in the separate model that estimated natural mortality. Therefore, we propose to estimate both natural and fishing mortality rates within a single model using a consistent set of hypotheses. This approach was applied to Moreton Bay brown tiger prawn data collected between 1990 and 2010. Natural mortality was estimated by maximum likelihood to be equal to 0.032 ± 0.002 week−1, approximately 30% lower than the fixed value used in previous models of this fishery (0.045 week−1).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In a very recent study [1] the Renormalisation Group (RNG) turbulence model was used to obtain flow predictions in a strongly swirling quarl burner, and was found to perform well in predicting certain features that are not well captured using less sophisticated models of turbulence. The implication is that the RNG approach should provide an economical and reliable tool for the prediction of swirling flows in combustor and furnace geometries commonly encountered in technological applications. To test this hypothesis the present work considers flow in a model furnace for which experimental data is available [2]. The essential features of the flow which differentiate it from the previous study [1] are that the annular air jet entry is relatively narrow and the base wall of the cylindrical furnace is at 90 degrees to the inlet pipe. For swirl numbers of order 1 the resulting flow is highly complex with significant inner and outer recirculation regions. The RNG and standard k-epsilon models are used to model the flow for both swirling and non-swirling entry jets and the results compared with experimental data [2]. Near wall viscous effects are accounted for in both models via the standard wall function formulation [3]. For the RNG model, additional computations with grid placement extending well inside the near wall viscous-affected sublayer are performed in order to assess the low Reynolds number capabilities of the model.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this work we numerically model isothermal turbulent swirling flow in a cylindrical burner. Three versions of the RNG k-epsilon model are assessed against performance of the standard k-epsilon model. Sensitivity of numerical predictions to grid refinement, differing convective differencing schemes and choice of (unknown) inlet dissipation rate, were closely scrutinised to ensure accuracy. Particular attention is paid to modelling the inlet conditions to within the range of uncertainty of the experimental data, as model predictions proved to be significantly sensitive to relatively small changes in upstream flow conditions. We also examine the characteristics of the swirl--induced recirculation zone predicted by the models over an extended range of inlet conditions. Our main findings are: - (i) the standard k-epsilon model performed best compared with experiment; - (ii) no one inlet specification can simultaneously optimize the performance of the models considered; - (iii) the RNG models predict both single-cell and double-cell IRZ characteristics, the latter both with and without additional internal stagnation points. The first finding indicates that the examined RNG modifications to the standard k-e model do not result in an improved eddy viscosity based model for the prediction of swirl flows. The second finding suggests that tuning established models for optimal performance in swirl flows a priori is not straightforward. The third finding indicates that the RNG based models exhibit a greater variety of structural behaviour, despite being of the same level of complexity as the standard k-e model. The plausibility of the predicted IRZ features are discussed in terms of known vortex breakdown phenomena.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This thesis introduced two novel reputation models to generate accurate item reputation scores using ratings data and the statistics of the dataset. It also presented an innovative method that incorporates reputation awareness in recommender systems by employing voting system methods to produce more accurate top-N item recommendations. Additionally, this thesis introduced a personalisation method for generating reputation scores based on users' interests, where a single item can have different reputation scores for different users. The personalised reputation scores are then used in the proposed reputation-aware recommender systems to enhance the recommendation quality.