942 results for Data Driven Modeling
Abstract:
This investigation deals with the question of when a particular population can be considered to be disease-free. The motivation is the case of BSE, where specific birth cohorts may present distinct disease-free subpopulations. The specific objective is to develop a statistical approach suitable for documenting freedom from disease, in particular freedom from BSE in birth cohorts. The approach is based upon a geometric waiting time distribution for the occurrence of positive surveillance results and formalizes the relationship between design prevalence, cumulative sample size and statistical power. The simple geometric waiting time model is further modified to account for the diagnostic sensitivity and specificity associated with the detection of disease. This is exemplified for BSE using two different models for the diagnostic sensitivity. The model is furthermore modified in such a way that a set of different values for the design prevalence in the surveillance streams can be accommodated (prevalence heterogeneity), and a general expression for the power function is developed. For illustration, numerical results for BSE suggest that currently (data status September 2004) a birth cohort of Danish cattle born after March 1999 is free from BSE with probability (power) of 0.8746 or 0.8509, depending on the choice of a model for the diagnostic sensitivity.
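To make the power calculation concrete, the following is a minimal sketch of the geometric waiting-time idea: if each tested animal is an independent Bernoulli trial, the power to document freedom from disease is the probability of observing at least one positive at the design prevalence. This is a simplified reading of the approach, and the parameter values are illustrative, not the paper's Danish BSE figures.

```python
# Minimal sketch (not the paper's full model): power to document freedom
# from disease under a geometric waiting-time assumption, where each tested
# animal is an independent Bernoulli trial with success probability
# design_prevalence * sensitivity. Parameter values are illustrative.

def freedom_power(n_samples: int, design_prevalence: float,
                  sensitivity: float = 1.0) -> float:
    """P(at least one positive among n_samples) if the true prevalence
    equals the design prevalence."""
    p_detect = design_prevalence * sensitivity
    return 1.0 - (1.0 - p_detect) ** n_samples

# e.g. 200,000 tested animals, design prevalence 1 per 100,000,
# diagnostic sensitivity 0.9 (all illustrative values):
print(freedom_power(200_000, 1e-5, sensitivity=0.9))  # ~0.835
```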
Recent developments in genetic data analysis: what can they tell us about human demographic history?
Abstract:
Over the last decade, a number of new methods of population genetic analysis based on likelihood have been introduced. This review describes and explains the general statistical techniques that have recently been used, and discusses the underlying population genetic models. Experimental papers that use these methods to infer human demographic and phylogeographic history are reviewed. It appears that the use of likelihood has hitherto had little impact in the field of human population genetics, which is still primarily driven by more traditional approaches. However, with the current uncertainty about the effects of natural selection, population structure and ascertainment of single-nucleotide polymorphism markers, it is suggested that likelihood-based methods may have a greater impact in the future.
Abstract:
Analyses of high-density single-nucleotide polymorphism (SNP) data, such as genetic mapping and linkage disequilibrium (LD) studies, require phase-known haplotypes to allow for the correlation between tightly linked loci. However, current SNP genotyping technology cannot determine phase, which must be inferred statistically. In this paper, we present a new Bayesian Markov chain Monte Carlo (MCMC) algorithm for population haplotype frequency estimation, particularly in the context of LD assessment. The novel feature of the method is the incorporation of a log-linear prior model for population haplotype frequencies. We present simulations suggesting that 1) the log-linear prior model is more appropriate than the standard coalescent process in the presence of recombination (>0.02 cM between adjacent loci), and 2) there is substantial inflation in measures of LD obtained by a "two-stage" approach to the analysis that treats the "best" haplotype configuration as correct, without regard to uncertainty in the recombination process.
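For readers unfamiliar with why phase must be inferred statistically, the sketch below uses the much simpler classical EM estimator for two-locus haplotype frequencies rather than the paper's Bayesian MCMC sampler with a log-linear prior; only double heterozygotes are phase-ambiguous, and the genotype data are invented.

```python
import numpy as np

def em_haplotypes(g1, g2, n_iter=200):
    """EM estimates of two-locus haplotype frequencies (00, 01, 10, 11)
    from unphased genotypes coded 0/1/2; a simplified stand-in for the
    paper's Bayesian MCMC approach."""
    f = np.full(4, 0.25)                    # uniform starting frequencies
    for _ in range(n_iter):
        c = np.zeros(4)
        for a, b in zip(g1, g2):
            if a == 1 and b == 1:           # double heterozygote: phase unknown
                w = f[0] * f[3] / (f[0] * f[3] + f[1] * f[2])
                c[[0, 3]] += w              # resolved as 00/11 with prob. w
                c[[1, 2]] += 1 - w          # otherwise as 01/10
            else:                           # phase fully determined
                c[2 * (a >= 1) + (b >= 1)] += 1
                c[2 * (a == 2) + (b == 2)] += 1
        f = c / c.sum()                     # M-step: renormalize
    return f

f = em_haplotypes([0, 1, 2, 1, 0, 2], [0, 1, 2, 1, 1, 2])
D = f[0] * f[3] - f[1] * f[2]               # LD coefficient from the fit
```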
Abstract:
A model for the structure of amorphous molybdenum trisulfide, a-MoS3, has been created using reverse Monte Carlo methods. This model, which consists of chains of MoS6 units, each sharing three sulfurs with each of its two neighbors and forming alternating long (nonbonded) and short (bonded) Mo-Mo separations, is a good fit to the neutron diffraction data and is chemically and physically realistic. The paper identifies the limitations of previous models based on Mo3 triangular clusters in accounting for the available experimental data.
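A reverse Monte Carlo refinement can be sketched in a few lines: random single-atom moves are accepted when they improve (or, with Metropolis probability, worsen) the agreement between a model-derived quantity and target data. This toy version matches a pair-distance histogram rather than a neutron structure factor, and all values are illustrative, not the study's.

```python
import numpy as np

rng = np.random.default_rng(0)
L = 20.0                                   # box edge, arbitrary units
pos = rng.uniform(0, L, size=(100, 3))     # 100 atoms, random start
bins = np.linspace(0.5, 10.0, 40)

def pair_hist(p):
    d = np.linalg.norm(p[:, None, :] - p[None, :, :], axis=-1)
    h, _ = np.histogram(d[np.triu_indices(len(p), k=1)], bins=bins)
    return h / h.sum()

target = pair_hist(rng.uniform(0, L, size=(100, 3)))   # stand-in "data"
chi2 = np.sum((pair_hist(pos) - target) ** 2)

for _ in range(2000):
    i = rng.integers(len(pos))
    old = pos[i].copy()
    pos[i] = (pos[i] + rng.normal(0, 0.3, 3)) % L      # trial move
    new = np.sum((pair_hist(pos) - target) ** 2)
    if new < chi2 or rng.random() < np.exp(-(new - chi2) / 1e-4):
        chi2 = new                                     # accept
    else:
        pos[i] = old                                   # reject, restore
```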
Abstract:
Two studies investigated the degree to which the relationship between rapid automatized naming (RAN) performance and reading development is driven by shared phonological processes. Study 1 assessed RAN, phonological awareness, and reading performance in 1010 7- to 10-year-olds. Results showed that RAN deficits occurred in the absence of phonological awareness deficits and were accompanied by modest reading delays. In structural equation modeling, solutions in which RAN was subsumed within a phonological processing factor did not provide a good fit to the data, suggesting that processes outside phonology may drive RAN performance and its association with reading. Study 2 investigated Kail's proposal that speed of processing underlies this relationship. Children with single RAN deficits showed slower speed of processing than did closely matched controls performing normally on RAN. However, regression analysis revealed that RAN made a unique contribution to reading even after accounting for processing speed. Theoretical implications are discussed.
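The unique-contribution claim in Study 2 rests on hierarchical regression: enter processing speed first, then test whether adding RAN increases the explained variance. A hedged sketch with synthetic data follows; the variable names and effect sizes are hypothetical, not the study's.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
speed = rng.normal(size=n)                       # processing speed (synthetic)
ran = 0.5 * speed + rng.normal(size=n)           # RAN correlates with speed
reading = 0.4 * speed + 0.3 * ran + rng.normal(size=n)

m1 = sm.OLS(reading, sm.add_constant(speed)).fit()
m2 = sm.OLS(reading, sm.add_constant(np.column_stack([speed, ran]))).fit()
print(f"R2, speed only: {m1.rsquared:.3f}")
print(f"R2, speed + RAN: {m2.rsquared:.3f}")     # increment = RAN's unique share
```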
Abstract:
Covariation in the structural composition of the gut microbiome and the spectroscopically derived metabolic phenotype (metabotype) of a rodent model for obesity were investigated using a range of multivariate statistical tools. Urine and plasma samples from three strains of 10-week-old male Zucker rats (obese (fa/fa, n = 8), lean (fa/-, n = 8) and lean (-/-, n = 8)) were characterized via high-resolution 1H NMR spectroscopy, and in parallel, the fecal microbial composition was investigated using fluorescence in situ hybridization (FISH) and denaturing gradient gel electrophoresis (DGGE) methods. All three Zucker strains had different relative abundances of the dominant members of their intestinal microbiota (FISH), with the novel observation of a Halomonas and a Sphingomonas species being present in the (fa/fa) obese strain on the basis of DGGE data. The two functionally and phenotypically normal Zucker strains (fa/- and -/-) were readily distinguished from the (fa/fa) obese rats on the basis of their metabotypes, with relatively lower urinary hippurate and creatinine, relatively higher levels of urinary isoleucine, leucine and acetate, and higher plasma LDL and VLDL levels typifying the (fa/fa) obese strain. Collectively, these data suggest a conditional host genetic involvement in selection of the microbial species in each host strain, and that both lean and obese animals could have specific metabolic phenotypes that are linked to their individual microbiomes.
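As an illustration of the kind of unsupervised multivariate analysis used to separate metabotypes (the abstract does not name the specific algorithms), here is a PCA on synthetic spectral data; the group structure and dimensions are invented.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
# 3 strains x 8 rats x 50 spectral bins; group means differ (synthetic)
spectra = np.vstack([rng.normal(loc=m, size=(8, 50))
                     for m in (0.0, 0.1, 0.8)])
scores = PCA(n_components=2).fit_transform(spectra)
# the obese-like group separates from the two lean-like groups on PC1
print(scores[:, 0].reshape(3, 8).mean(axis=1))
```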
Abstract:
A new primary model based on a thermodynamically consistent first-order kinetic approach was constructed to describe non-log-linear inactivation kinetics of pressure-treated bacteria. The model assumes a first-order process in which the specific inactivation rate changes inversely with the square root of time. The model gave reasonable fits to experimental data over six to seven orders of magnitude. It was also tested on 138 published data sets and provided good fits in about 70% of cases in which the shape of the curve followed the typical convex upward form. In the remainder of the published examples, curves contained additional shoulder regions or extended tail regions. Curves with shoulders could be accommodated by including an additional time delay parameter, and curves with tails could be accommodated by omitting points in the tail beyond the point at which survival levels remained more or less constant. The model parameters varied regularly with pressure, which may reflect a genuine mechanistic basis for the model. This property also allowed the calculation of (a) parameters analogous to the decimal reduction time D and to z (the temperature increase needed to change the D value by a factor of 10) in thermal processing, and hence the processing conditions needed to attain a desired level of inactivation; and (b) the apparent thermodynamic volumes of activation associated with the lethal events. The hypothesis that inactivation rates change as a function of the square root of time would be consistent with a diffusion-limited process.
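The primary model can be written down directly: integrating dN/dt = -(b/√t)·N gives ln(N/N0) = -2b√t, i.e. log-survival is linear in the square root of time. A short fitting sketch with synthetic, convex-upward data (not the paper's 138 data sets) follows.

```python
import numpy as np
from scipy.optimize import curve_fit

def log10_survivors(t, b):
    # integrating dN/dt = -(b/sqrt(t)) * N  gives  ln(N/N0) = -2*b*sqrt(t)
    return -2.0 * b * np.sqrt(t) / np.log(10)

t = np.array([0.5, 1, 2, 4, 8, 16])                     # minutes (synthetic)
logS = np.array([-0.6, -0.9, -1.3, -1.8, -2.6, -3.6])   # convex-upward curve
(b,), _ = curve_fit(log10_survivors, t, logS, p0=[1.0])
print(f"fitted rate parameter b = {b:.2f} min^-0.5")
```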
Abstract:
Quantitative control of aroma generation during the Maillard reaction presents great scientific and industrial interest. Although there have been many studies conducted in simplified model systems, the results are difficult to apply to complex food systems, where the presence of other components can have a significant impact. In this work, an aqueous extract of defatted beef liver was chosen as a simplified food matrix for studying the kinetics of the Maillard reaction. Aliquots of the extract were heated under different time and temperature conditions and analyzed for sugars, amino acids, and methylbutanals, which are important Maillard-derived aroma compounds formed in cooked meat. Multiresponse kinetic modeling, based on a simplified mechanistic pathway, gave a good fit to the experimental data, but only when additional steps were introduced to take into account the interactions of glucose and glucose-derived intermediates with protein and other amino compounds. This emphasizes the significant role of the food matrix in controlling the Maillard reaction.
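Multiresponse kinetic modeling means fitting all measured species to one mechanistic ODE scheme simultaneously, rather than fitting each response in isolation. The sketch below uses a toy two-step pathway (sugar → intermediate → aroma compound) with invented concentrations; the paper's actual scheme includes additional steps for glucose-protein interactions.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

def rhs(t, y, k1, k2):                       # toy scheme: G -> I -> A
    glc, inter, aroma = y
    return [-k1 * glc, k1 * glc - k2 * inter, k2 * inter]

t_obs = np.array([0.0, 5, 10, 20, 40])       # heating time (min, invented)
y_obs = np.array([[1.00, 0.00, 0.00], [0.60, 0.30, 0.08],
                  [0.37, 0.38, 0.22], [0.14, 0.33, 0.50],
                  [0.02, 0.13, 0.83]])       # all three responses (invented)

def residuals(k):
    sol = solve_ivp(rhs, (0, 40), [1.0, 0.0, 0.0], t_eval=t_obs, args=tuple(k))
    return (sol.y.T - y_obs).ravel()         # fit every species at once

fit = least_squares(residuals, x0=[0.1, 0.05])
print(fit.x)                                 # fitted rate constants k1, k2
```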
Abstract:
Studies of ignorance-driven decision making have either analysed, on theoretical grounds, when ignorance should prove advantageous, or examined whether human behaviour is consistent with an ignorance-driven inference strategy (e.g., the recognition heuristic). In the current study we examine whether, under conditions where such inferences might be expected, the advantages that theoretical analyses predict are evident in human performance data. A single experiment shows that, when asked to make relative wealth judgements, participants reliably use recognition as a basis for their judgements. Their wealth judgements under these conditions are reliably more accurate when some of the target names are unknown than when participants recognize all of the names (a "less-is-more effect"). These results are consistent across variations in the number of options given to participants and in the nature of the wealth judgement. A basic model of recognition-based inference predicts these effects.
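The "basic model of recognition-based inference" the abstract invokes is commonly formalized as in Goldstein and Gigerenzer's recognition-heuristic analysis: with n of N items recognized, recognition validity alpha and knowledge validity beta, expected accuracy can peak below full recognition whenever alpha > beta. The values below are illustrative, not the experiment's.

```python
def expected_accuracy(n: int, N: int, alpha: float, beta: float) -> float:
    """Expected two-alternative accuracy when n of N items are recognized,
    with recognition validity alpha and knowledge validity beta."""
    pairs = N * (N - 1)
    p_one = 2 * n * (N - n) / pairs          # exactly one item recognized
    p_both = n * (n - 1) / pairs             # both recognized: use knowledge
    p_none = (N - n) * (N - n - 1) / pairs   # neither recognized: guess
    return p_one * alpha + p_both * beta + p_none * 0.5

# with alpha > beta, partial ignorance beats recognizing everything:
for n in (5, 10, 15, 20):
    print(n, round(expected_accuracy(n, 20, alpha=0.8, beta=0.6), 3))
# accuracy peaks near n = 10 (~0.68) and drops to 0.60 at n = 20
```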
Abstract:
Inverse problems for dynamical system models of cognitive processes comprise the determination of synaptic weight matrices or kernel functions for neural networks or neural/dynamic field models, respectively. We introduce dynamic cognitive modeling as a three-tier top-down approach in which cognitive processes are first described as algorithms that operate on complex symbolic data structures. Second, symbolic expressions and operations are represented by states and transformations in abstract vector spaces. Third, prescribed trajectories through representation space are implemented in neurodynamical systems. We discuss the Amari equation for a neural/dynamic field theory as a special case and show that the kernel construction problem is particularly ill-posed. We suggest a Tikhonov-Hebbian learning method as a regularization technique and demonstrate its validity and robustness for basic examples of cognitive computations.
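In discretized form the Amari field reads tau·du/dt = -u + W f(u), and Tikhonov-regularized kernel construction then amounts to ridge regression for the matrix W given a prescribed trajectory. A hedged numerical sketch follows; the sizes, random trajectory, and tanh nonlinearity are illustrative choices, not the paper's examples.

```python
import numpy as np

rng = np.random.default_rng(3)
f = np.tanh                          # firing-rate nonlinearity (assumed)
tau, lam, dt = 1.0, 1e-2, 0.01
U = rng.normal(size=(50, 400))       # prescribed trajectory: 50 nodes, 400 steps

dU = np.gradient(U, dt, axis=1)      # finite-difference time derivative
X = f(U[:, :-1])                     # regressors: f(u) at each step
Y = tau * dU[:, :-1] + U[:, :-1]     # what W f(u) must reproduce

# Tikhonov (ridge) solution of the ill-posed kernel construction problem
W = Y @ X.T @ np.linalg.inv(X @ X.T + lam * np.eye(len(X)))
print(np.linalg.norm(Y - W @ X) / np.linalg.norm(Y))   # relative residual
```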
Abstract:
A large volume of visual content remains inaccessible until effective and efficient indexing and retrieval of such data is achieved. In this paper, we introduce the DREAM system, a knowledge-assisted, semantic-driven, context-aware visual information retrieval system applied in the film post-production domain. We mainly focus on the automatic labelling and topic-map-related aspects of the framework. The use of context-related collateral knowledge, represented by a novel probability-based visual keyword co-occurrence matrix, proved effective in the experiments conducted during system evaluation. The automatically generated semantic labels were fed into the Topic Map Engine, which can automatically construct ontological networks using Topic Maps technology, dramatically enhancing the indexing and retrieval performance of the system towards an even higher semantic level.
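The co-occurrence idea can be illustrated compactly: count how often visual keywords are assigned to the same shot and row-normalize to obtain conditional probabilities. The keyword vocabulary and labels below are hypothetical, not the DREAM system's.

```python
import numpy as np

keywords = ["indoor", "face", "night", "crowd"]      # hypothetical vocabulary
shots = np.array([[1, 1, 0, 0],                      # one row per shot;
                  [1, 1, 0, 1],                      # 1 = keyword assigned
                  [0, 0, 1, 1],
                  [1, 0, 0, 1]])

counts = shots.T @ shots                             # raw co-occurrence counts
cooc = counts / counts.diagonal()[:, None]           # row-normalize: P(j | i)
print(cooc[keywords.index("face")])                  # context profile of "face"
```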
Abstract:
The paper introduces an efficient construction algorithm for obtaining sparse linear-in-the-weights regression models based on an approach of directly optimizing model generalization capability. This is achieved by utilizing the delete-1 cross-validation concept and the associated leave-one-out test error, also known as the predicted residual sums of squares (PRESS) statistic, without resorting to any other validation data set for model evaluation in the model construction process. Computational efficiency is ensured by using an orthogonal forward regression, but the algorithm incrementally minimizes the PRESS statistic instead of the usual sum of the squared training errors. A local regularization method can naturally be incorporated into the model selection procedure to further enforce model sparsity. The proposed algorithm is fully automatic, and the user is not required to specify any criterion to terminate the model construction procedure. Comparisons with some existing state-of-the-art modeling methods are given, and several examples are included to demonstrate the ability of the proposed algorithm to effectively construct sparse models that generalize well.
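The key identity behind the PRESS statistic is that leave-one-out residuals of a linear model require no refitting: e_loo,i = e_i / (1 - h_ii), with h_ii the hat-matrix diagonal. A minimal demonstration on synthetic data follows; the paper's contribution, not reproduced here, is computing this quantity recursively inside orthogonal forward regression.

```python
import numpy as np

rng = np.random.default_rng(4)
X = np.column_stack([np.ones(30), rng.normal(size=(30, 3))])
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(scale=0.3, size=30)

beta = np.linalg.lstsq(X, y, rcond=None)[0]
H = X @ np.linalg.inv(X.T @ X) @ X.T            # hat matrix
e = y - X @ beta                                # ordinary residuals
press = np.sum((e / (1 - np.diag(H))) ** 2)     # PRESS, no refitting needed
print(press)   # equals the sum of squared true leave-one-out errors
```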
Abstract:
Many kernel classifier construction algorithms adopt classification accuracy as the performance metric in model evaluation. Moreover, equal weighting is often applied to each data sample in parameter estimation. These modeling practices often become problematic if the data sets are imbalanced. We present a kernel classifier construction algorithm using orthogonal forward selection (OFS) in order to optimize the model generalization for imbalanced two-class data sets. This kernel classifier identification algorithm is based on a new regularized orthogonal weighted least squares (ROWLS) estimator and the model selection criterion of maximal leave-one-out area under the curve (LOO-AUC) of the receiver operating characteristics (ROC). It is shown that, owing to the orthogonalization procedure, the LOO-AUC can be calculated via an analytic formula based on the new regularized orthogonal weighted least squares parameter estimator, without actually splitting the estimation data set. The proposed algorithm can achieve minimal computational expense via a set of forward recursive updating formulae when searching for model terms with maximal incremental LOO-AUC value. Numerical examples are used to demonstrate the efficacy of the algorithm.
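Why optimize AUC rather than accuracy on imbalanced data? A majority-class predictor scores high accuracy yet is uninformative, which AUC exposes. A small hedged illustration using the Mann-Whitney rank form of AUC follows; the class proportions and scores are invented.

```python
import numpy as np

def auc(scores, labels):
    """Mann-Whitney form: fraction of (positive, negative) pairs
    that the scores rank correctly."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    return np.mean(pos[:, None] > neg[None, :])

rng = np.random.default_rng(5)
labels = np.concatenate([np.ones(10), np.zeros(990)])   # 1% positives
preds = np.zeros(1010)                                  # always majority class
print("accuracy:", (preds == labels).mean())            # ~0.99, yet useless
print("AUC:", auc(rng.normal(size=1010), labels))       # ~0.5: uninformative
```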
Abstract:
Although extensively studied within the lidar community, the multiple scattering phenomenon has always been considered a rare curiosity by radar meteorologists. Until a few years ago its appearance had only been associated with two- or three-body-scattering features (e.g. hail flares and mirror images) involving highly reflective surfaces. Recent atmospheric research aimed at a better understanding of the water cycle and the role played by clouds and precipitation in affecting the Earth's climate has driven the deployment of high-frequency radars in space. Examples are the TRMM 13.5 GHz, the CloudSat 94 GHz, the upcoming EarthCARE 94 GHz, and the GPM dual 13-35 GHz radars. These systems are able to detect the vertical distribution of hydrometeors and thus provide crucial feedback for radiation and climate studies. The shift towards higher frequencies increases the sensitivity to hydrometeors, improves the spatial resolution and reduces the size and weight of the radar systems. On the other hand, higher-frequency radars are affected by stronger extinction, especially in the presence of large precipitating particles (e.g. raindrops or hail particles), which may eventually drive the signal below the minimum detection threshold. In such circumstances the interpretation of the radar equation via the single scattering approximation may be problematic. Errors will be large when the radiation emitted from the radar still contributes substantially to the received power after interacting more than once with the medium. This is the case if the transport mean free path becomes comparable with the instrument footprint (determined by the antenna beamwidth and the platform altitude). This situation resembles what has already been experienced in lidar observations, but with a predominance of wide-angle over small-angle scattering events. At millimeter wavelengths, hydrometeors diffuse radiation rather isotropically, compared to the visible or near-infrared region where scattering is predominantly in the forward direction. A complete understanding of radiation transport modeling and data analysis methods under wide-angle multiple scattering conditions is mandatory for a correct interpretation of echoes observed by space-borne millimeter radars. This paper reviews the status of research in this field. Different numerical techniques currently implemented to account for higher-order scattering are reviewed and their weaknesses and strengths highlighted. Examples of simulated radar backscattering profiles are provided, with particular emphasis given to situations in which the multiple scattering contributions become comparable to or overwhelm the single scattering signal. We show evidence of multiple scattering effects from airborne and CloudSat observations, i.e. unique signatures which cannot be explained by single scattering theory. Ideas on how to identify and tackle multiple scattering effects are discussed. Finally, perspectives and suggestions for future work are outlined. This work represents a reference guide for studies focused on modeling the radiation transport and on interpreting data from high-frequency space-borne radar systems that probe highly opaque scattering media such as thick ice clouds or precipitating clouds.
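The multiple-scattering criterion stated above (transport mean free path comparable to the instrument footprint) can be checked with back-of-envelope numbers in the CloudSat regime; all values below are illustrative, not results from the paper.

```python
import math

altitude_m = 700e3                          # CloudSat-like orbit
beamwidth_rad = 0.12 * math.pi / 180        # ~0.12 deg antenna beamwidth
footprint_m = altitude_m * beamwidth_rad    # ~1.5 km at the surface

extinction_per_km = 2.0                     # illustrative, deep convection
mean_free_path_m = 1000.0 / extinction_per_km   # ~0.5 km

print(f"footprint ~ {footprint_m / 1e3:.1f} km, "
      f"mean free path ~ {mean_free_path_m / 1e3:.1f} km")
# comparable scales: the single-scattering radar equation becomes unreliable
```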
Abstract:
Strong vertical gradients at the top of the atmospheric boundary layer affect the propagation of electromagnetic waves and can produce radar ducts. A three-dimensional, time-dependent, nonhydrostatic numerical model was used to simulate the propagation environment in the atmosphere over the Persian Gulf when aircraft observations of ducting had been made. A division of the observations into high- and low-wind cases was used as a framework for the simulations. Three sets of simulations were conducted with initial conditions of varying degrees of idealization and were compared with the observations taken in the Ship Antisubmarine Warfare Readiness/Effectiveness Measuring (SHAREM-115) program. The best results occurred with the initialization based on a sounding taken over the coast, modified by the inclusion of data on low-level atmospheric conditions over the Gulf waters. The development of moist, cool, stable marine internal boundary layers (MIBLs) in air flowing from land over the waters of the Gulf was simulated. The MIBLs were capped by temperature inversions and associated lapses of humidity and refractivity. The low-wind MIBL was shallower and the gradients at its top were sharper than in the high-wind case, in agreement with the observations. Because it is also forced by land–sea contrasts, a sea-breeze circulation frequently occurs in association with the MIBL. The size, location, and internal structure of the sea-breeze circulation were realistically simulated. The gradients of temperature and humidity that bound the MIBL cause perturbations in the refractivity distribution that, in turn, lead to trapping layers and ducts. The existence, location, and surface character of the ducts were well captured. Horizontal variations in duct characteristics due to the sea-breeze circulation were also evident. The simulations successfully distinguished between high- and low-wind occasions, a notable feature of the SHAREM-115 observations. The modeled magnitudes of duct depth and strength, although leaving scope for improvement, were most encouraging.
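Ducts are commonly diagnosed from model output via the modified refractivity M = N + 0.157·z (z in m), with radio refractivity N = 77.6·P/T + 3.73e5·e/T² (P and e in hPa, T in K); a trapping layer is any layer with dM/dz < 0. A small worked example on an invented three-level profile, not the SHAREM-115 data:

```python
import numpy as np

z = np.array([0.0, 100.0, 300.0])        # height (m)
T = np.array([303.0, 301.0, 305.0])      # temperature (K): inversion aloft
P = np.array([1010.0, 998.0, 975.0])     # pressure (hPa)
e = np.array([25.0, 24.0, 8.0])          # vapour pressure (hPa): sharp drying

N = 77.6 * P / T + 3.73e5 * e / T**2     # radio refractivity
M = N + 0.157 * z                        # modified refractivity
dMdz = np.diff(M) / np.diff(z)
print(np.where(dMdz < 0)[0])             # flags layer 1, the capping inversion
```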