42 resultados para Statistical Language Model
Resumo:
Uncertainty quantification of petroleum reservoir models is one of the present challenges, which is usually approached with a wide range of geostatistical tools linked with statistical optimisation or/and inference algorithms. Recent advances in machine learning offer a novel approach to model spatial distribution of petrophysical properties in complex reservoirs alternative to geostatistics. The approach is based of semisupervised learning, which handles both ?labelled? observed data and ?unlabelled? data, which have no measured value but describe prior knowledge and other relevant data in forms of manifolds in the input space where the modelled property is continuous. Proposed semi-supervised Support Vector Regression (SVR) model has demonstrated its capability to represent realistic geological features and describe stochastic variability and non-uniqueness of spatial properties. On the other hand, it is able to capture and preserve key spatial dependencies such as connectivity of high permeability geo-bodies, which is often difficult in contemporary petroleum reservoir studies. Semi-supervised SVR as a data driven algorithm is designed to integrate various kind of conditioning information and learn dependences from it. The semi-supervised SVR model is able to balance signal/noise levels and control the prior belief in available data. In this work, stochastic semi-supervised SVR geomodel is integrated into Bayesian framework to quantify uncertainty of reservoir production with multiple models fitted to past dynamic observations (production history). Multiple history matched models are obtained using stochastic sampling and/or MCMC-based inference algorithms, which evaluate posterior probability distribution. Uncertainty of the model is described by posterior probability of the model parameters that represent key geological properties: spatial correlation size, continuity strength, smoothness/variability of spatial property distribution. The developed approach is illustrated with a fluvial reservoir case. The resulting probabilistic production forecasts are described by uncertainty envelopes. The paper compares the performance of the models with different combinations of unknown parameters and discusses sensitivity issues.
Resumo:
Familial searching consists of searching for a full profile left at a crime scene in a National DNA Database (NDNAD). In this paper we are interested in the circumstance where no full match is returned, but a partial match is found between a database member's profile and the crime stain. Because close relatives share more of their DNA than unrelated persons, this partial match may indicate that the crime stain was left by a close relative of the person with whom the partial match was found. This approach has successfully solved important crimes in the UK and the USA. In a previous paper, a model, which takes into account substructure and siblings, was used to simulate a NDNAD. In this paper, we have used this model to test the usefulness of familial searching and offer guidelines for pre-assessment of the cases based on the likelihood ratio. Siblings of "persons" present in the simulated Swiss NDNAD were created. These profiles (N=10,000) were used as traces and were then compared to the whole database (N=100,000). The statistical results obtained show that the technique has great potential confirming the findings of previous studies. However, effectiveness of the technique is only one part of the story. Familial searching has juridical and ethical aspects that should not be ignored. In Switzerland for example, there are no specific guidelines to the legality or otherwise of familial searching. This article both presents statistical results, and addresses criminological and civil liberties aspects to take into account risks and benefits of familial searching.
Resumo:
Due to their performance enhancing properties, use of anabolic steroids (e.g. testosterone, nandrolone, etc.) is banned in elite sports. Therefore, doping control laboratories accredited by the World Anti-Doping Agency (WADA) screen among others for these prohibited substances in urine. It is particularly challenging to detect misuse with naturally occurring anabolic steroids such as testosterone (T), which is a popular ergogenic agent in sports and society. To screen for misuse with these compounds, drug testing laboratories monitor the urinary concentrations of endogenous steroid metabolites and their ratios, which constitute the steroid profile and compare them with reference ranges to detect unnaturally high values. However, the interpretation of the steroid profile is difficult due to large inter-individual variances, various confounding factors and different endogenous steroids marketed that influence the steroid profile in various ways. A support vector machine (SVM) algorithm was developed to statistically evaluate urinary steroid profiles composed of an extended range of steroid profile metabolites. This model makes the interpretation of the analytical data in the quest for deviating steroid profiles feasible and shows its versatility towards different kinds of misused endogenous steroids. The SVM model outperforms the current biomarkers with respect to detection sensitivity and accuracy, particularly when it is coupled to individual data as stored in the Athlete Biological Passport.
Resumo:
La tomodensitométrie (CT) est une technique d'imagerie dont l'intérêt n'a cessé de croître depuis son apparition dans le début des années 70. Dans le domaine médical, son utilisation est incontournable à tel point que ce système d'imagerie pourrait être amené à devenir victime de son succès si son impact au niveau de l'exposition de la population ne fait pas l'objet d'une attention particulière. Bien évidemment, l'augmentation du nombre d'examens CT a permis d'améliorer la prise en charge des patients ou a rendu certaines procédures moins invasives. Toutefois, pour assurer que le compromis risque - bénéfice soit toujours en faveur du patient, il est nécessaire d'éviter de délivrer des doses non utiles au diagnostic.¦Si cette action est importante chez l'adulte elle doit être une priorité lorsque les examens se font chez l'enfant, en particulier lorsque l'on suit des pathologies qui nécessitent plusieurs examens CT au cours de la vie du patient. En effet, les enfants et jeunes adultes sont plus radiosensibles. De plus, leur espérance de vie étant supérieure à celle de l'adulte, ils présentent un risque accru de développer un cancer radio-induit dont la phase de latence peut être supérieure à vingt ans. Partant du principe que chaque examen radiologique est justifié, il devient dès lors nécessaire d'optimiser les protocoles d'acquisitions pour s'assurer que le patient ne soit pas irradié inutilement. L'avancée technologique au niveau du CT est très rapide et depuis 2009, de nouvelles techniques de reconstructions d'images, dites itératives, ont été introduites afin de réduire la dose et améliorer la qualité d'image.¦Le présent travail a pour objectif de déterminer le potentiel des reconstructions itératives statistiques pour réduire au minimum les doses délivrées lors d'examens CT chez l'enfant et le jeune adulte tout en conservant une qualité d'image permettant le diagnostic, ceci afin de proposer des protocoles optimisés.¦L'optimisation d'un protocole d'examen CT nécessite de pouvoir évaluer la dose délivrée et la qualité d'image utile au diagnostic. Alors que la dose est estimée au moyen d'indices CT (CTDIV0| et DLP), ce travail a la particularité d'utiliser deux approches radicalement différentes pour évaluer la qualité d'image. La première approche dite « physique », se base sur le calcul de métriques physiques (SD, MTF, NPS, etc.) mesurées dans des conditions bien définies, le plus souvent sur fantômes. Bien que cette démarche soit limitée car elle n'intègre pas la perception des radiologues, elle permet de caractériser de manière rapide et simple certaines propriétés d'une image. La seconde approche, dite « clinique », est basée sur l'évaluation de structures anatomiques (critères diagnostiques) présentes sur les images de patients. Des radiologues, impliqués dans l'étape d'évaluation, doivent qualifier la qualité des structures d'un point de vue diagnostique en utilisant une échelle de notation simple. Cette approche, lourde à mettre en place, a l'avantage d'être proche du travail du radiologue et peut être considérée comme méthode de référence.¦Parmi les principaux résultats de ce travail, il a été montré que les algorithmes itératifs statistiques étudiés en clinique (ASIR?, VEO?) ont un important potentiel pour réduire la dose au CT (jusqu'à-90%). Cependant, par leur fonctionnement, ils modifient l'apparence de l'image en entraînant un changement de texture qui pourrait affecter la qualité du diagnostic. En comparant les résultats fournis par les approches « clinique » et « physique », il a été montré que ce changement de texture se traduit par une modification du spectre fréquentiel du bruit dont l'analyse permet d'anticiper ou d'éviter une perte diagnostique. Ce travail montre également que l'intégration de ces nouvelles techniques de reconstruction en clinique ne peut se faire de manière simple sur la base de protocoles utilisant des reconstructions classiques. Les conclusions de ce travail ainsi que les outils développés pourront également guider de futures études dans le domaine de la qualité d'image, comme par exemple, l'analyse de textures ou la modélisation d'observateurs pour le CT.¦-¦Computed tomography (CT) is an imaging technique in which interest has been growing since it first began to be used in the early 1970s. In the clinical environment, this imaging system has emerged as the gold standard modality because of its high sensitivity in producing accurate diagnostic images. However, even if a direct benefit to patient healthcare is attributed to CT, the dramatic increase of the number of CT examinations performed has raised concerns about the potential negative effects of ionizing radiation on the population. To insure a benefit - risk that works in favor of a patient, it is important to balance image quality and dose in order to avoid unnecessary patient exposure.¦If this balance is important for adults, it should be an absolute priority for children undergoing CT examinations, especially for patients suffering from diseases requiring several follow-up examinations over the patient's lifetime. Indeed, children and young adults are more sensitive to ionizing radiation and have an extended life span in comparison to adults. For this population, the risk of developing cancer, whose latency period exceeds 20 years, is significantly higher than for adults. Assuming that each patient examination is justified, it then becomes a priority to optimize CT acquisition protocols in order to minimize the delivered dose to the patient. Over the past few years, CT advances have been developing at a rapid pace. Since 2009, new iterative image reconstruction techniques, called statistical iterative reconstructions, have been introduced in order to decrease patient exposure and improve image quality.¦The goal of the present work was to determine the potential of statistical iterative reconstructions to reduce dose as much as possible without compromising image quality and maintain diagnosis of children and young adult examinations.¦The optimization step requires the evaluation of the delivered dose and image quality useful to perform diagnosis. While the dose is estimated using CT indices (CTDIV0| and DLP), the particularity of this research was to use two radically different approaches to evaluate image quality. The first approach, called the "physical approach", computed physical metrics (SD, MTF, NPS, etc.) measured on phantoms in well-known conditions. Although this technique has some limitations because it does not take radiologist perspective into account, it enables the physical characterization of image properties in a simple and timely way. The second approach, called the "clinical approach", was based on the evaluation of anatomical structures (diagnostic criteria) present on patient images. Radiologists, involved in the assessment step, were asked to score image quality of structures for diagnostic purposes using a simple rating scale. This approach is relatively complicated to implement and also time-consuming. Nevertheless, it has the advantage of being very close to the practice of radiologists and is considered as a reference method.¦Primarily, this work revealed that the statistical iterative reconstructions studied in clinic (ASIR? and VECO have a strong potential to reduce CT dose (up to -90%). However, by their mechanisms, they lead to a modification of the image appearance with a change in image texture which may then effect the quality of the diagnosis. By comparing the results of the "clinical" and "physical" approach, it was showed that a change in texture is related to a modification of the noise spectrum bandwidth. The NPS analysis makes possible to anticipate or avoid a decrease in image quality. This project demonstrated that integrating these new statistical iterative reconstruction techniques can be complex and cannot be made on the basis of protocols using conventional reconstructions. The conclusions of this work and the image quality tools developed will be able to guide future studies in the field of image quality as texture analysis or model observers dedicated to CT.
Resumo:
Introduction: Neuroimaging of the self focused on high-level mechanisms such as language, memory or imagery of the self. Recent evidence suggests that low-level mechanisms of multisensory and sensorimotor integration may play a fundamental role in encoding self-location and the first-person perspective (Blanke and Metzinger, 2009). Neurological patients with out-of body experiences (OBE) suffer from abnormal self-location and the first-person perspective due to a damage in the temporo-parietal junction (Blanke et al., 2004). Although self-location and the first-person perspective can be studied experimentally (Lenggenhager et al., 2009), the neural underpinnings of self-location have yet to be investigated. To investigate the brain network involved in self-location and first-person perspective we used visuo-tactile multisensory conflict, magnetic resonance (MR)-compatible robotics, and fMRI in study 1, and lesion analysis in a sample of 9 patients with OBE due to focal brain damage in study 2. Methods: Twenty-two participants saw a video showing either a person's back or an empty room being stroked (visual stimuli) while the MR-compatible robotic device stroked their back (tactile stimulation). Direction and speed of the seen stroking could either correspond (synchronous) or not (asynchronous) to those of the seen stroking. Each run comprised the four conditions according to a 2x2 factorial design with Object (Body, No-Body) and Synchrony (Synchronous, Asynchronous) as main factors. Self-location was estimated using the mental ball dropping (MBD; Lenggenhager et al., 2009). After the fMRI session participants completed a 6-item adapted from the original questionnaire created by Botvinick and Cohen (1998) and based on questions and data obtained by Lenggenhager et al. (2007, 2009). They were also asked to complete a questionnaire to disclose the perspective they adopted during the illusion. Response times (RTs) for the MBD and fMRI data were analyzed with a 3-way mixed model ANOVA with the in-between factor Perspective (up, down) and the two with-in factors Object (body, no-body) and Stroking (synchronous, asynchronous). Quantitative lesion analysis was performed using MRIcron (Rorden et al., 2007). We compared the distributions of brain lesions confirmed by multimodality imaging (Knowlton, 2004) in patients with OBE with those showing complex visual hallucinations involving people or faces, but without any disturbance of self-location and first person perspective. Nine patients with OBE were investigated. The control group comprised 8 patients. Structural imaging data were available for normalization and co-registration in all the patients. Normalization of each patient's lesion into the common MNI (Montreal Neurological Institute) reference space permitted simple, voxel-wise, algebraic comparisons to be made. Results: Even if in the scanner all participants were lying on their back and were facing upwards, analysis of perspective showed that half of the participants had the impression to be looking down at the virtual human body below them, despite any cues about their body position (Down-group). The other participants had the impression to be looking up at the virtual body above them (Up-group). Analysis of Q3 ("How strong was the feeling that the body you saw was you?") indicated stronger self-identification with the virtual body during the synchronous stroking. RTs in the MBD task confirmed these subjective data (significant 3-way interaction between perspective, object and stroking). fMRI results showed eight cortical regions where the BOLD signal was significantly different during at least one of the conditions resulting from the combination of Object and Stroking, relative to baseline: right and left temporo-parietal junction, right EBA, left middle occipito-temporal gyrus, left postcentral gyrus, right medial parietal lobe, bilateral medial occipital lobe (Fig 1). The activation patterns in right and left temporo-parietal junction and right EBA reflected changes in self-location and perspective as revealed by statistical analysis that was performed on the percentage of BOLD change with respect to the baseline. Statistical lesion overlap comparison (using nonparametric voxel based lesion symptom mapping) with respect to the control group revealed the right temporo-parietal junction, centered at the angular gyrus (Talairach coordinates x = 54, y =-52, z = 26; p>0.05, FDR corrected). Conclusions: The present questionnaire and behavioural results show that - despite the noisy and constraining MR environment) our participants had predictable changes in self-location, self-identification, and first-person perspective when robotic tactile stroking was applied synchronously with the robotic visual stroking. fMRI data in healthy participants and lesion data in patients with abnormal self-location and first-person perspective jointly revealed that the temporo-parietal cortex especially in the right hemisphere encodes these conscious experiences. We argue that temporo-parietal activity reflects the experience of the conscious "I" as embodied and localized within bodily space.
Resumo:
PURPOSE: In the radiopharmaceutical therapy approach to the fight against cancer, in particular when it comes to translating laboratory results to the clinical setting, modeling has served as an invaluable tool for guidance and for understanding the processes operating at the cellular level and how these relate to macroscopic observables. Tumor control probability (TCP) is the dosimetric end point quantity of choice which relates to experimental and clinical data: it requires knowledge of individual cellular absorbed doses since it depends on the assessment of the treatment's ability to kill each and every cell. Macroscopic tumors, seen in both clinical and experimental studies, contain too many cells to be modeled individually in Monte Carlo simulation; yet, in particular for low ratios of decays to cells, a cell-based model that does not smooth away statistical considerations associated with low activity is a necessity. The authors present here an adaptation of the simple sphere-based model from which cellular level dosimetry for macroscopic tumors and their end point quantities, such as TCP, may be extrapolated more reliably. METHODS: Ten homogenous spheres representing tumors of different sizes were constructed in GEANT4. The radionuclide 131I was randomly allowed to decay for each model size and for seven different ratios of number of decays to number of cells, N(r): 1000, 500, 200, 100, 50, 20, and 10 decays per cell. The deposited energy was collected in radial bins and divided by the bin mass to obtain the average bin absorbed dose. To simulate a cellular model, the number of cells present in each bin was calculated and an absorbed dose attributed to each cell equal to the bin average absorbed dose with a randomly determined adjustment based on a Gaussian probability distribution with a width equal to the statistical uncertainty consistent with the ratio of decays to cells, i.e., equal to Nr-1/2. From dose volume histograms the surviving fraction of cells, equivalent uniform dose (EUD), and TCP for the different scenarios were calculated. Comparably sized spherical models containing individual spherical cells (15 microm diameter) in hexagonal lattices were constructed, and Monte Carlo simulations were executed for all the same previous scenarios. The dosimetric quantities were calculated and compared to the adjusted simple sphere model results. The model was then applied to the Bortezomib-induced enzyme-targeted radiotherapy (BETR) strategy of targeting Epstein-Barr virus (EBV)-expressing cancers. RESULTS: The TCP values were comparable to within 2% between the adjusted simple sphere and full cellular models. Additionally, models were generated for a nonuniform distribution of activity, and results were compared between the adjusted spherical and cellular models with similar comparability. The TCP values from the experimental macroscopic tumor results were consistent with the experimental observations for BETR-treated 1 g EBV-expressing lymphoma tumors in mice. CONCLUSIONS: The adjusted spherical model presented here provides more accurate TCP values than simple spheres, on par with full cellular Monte Carlo simulations while maintaining the simplicity of the simple sphere model. This model provides a basis for complementing and understanding laboratory and clinical results pertaining to radiopharmaceutical therapy.
Resumo:
Swain corrects the chi-square overidentification test (i.e., likelihood ratio test of fit) for structural equation models whethr with or without latent variables. The chi-square statistic is asymptotically correct; however, it does not behave as expected in small samples and/or when the model is complex (cf. Herzog, Boomsma, & Reinecke, 2007). Thus, particularly in situations where the ratio of sample size (n) to the number of parameters estimated (p) is relatively small (i.e., the p to n ratio is large), the chi-square test will tend to overreject correctly specified models. To obtain a closer approximation to the distribution of the chi-square statistic, Swain (1975) developed a correction; this scaling factor, which converges to 1 asymptotically, is multiplied with the chi-square statistic. The correction better approximates the chi-square distribution resulting in more appropriate Type 1 reject error rates (see Herzog & Boomsma, 2009; Herzog, et al., 2007).
Resumo:
When back-calculating fish length from scale measurements, the choice of the body-scale relationship is a fundamental step. Using data from the arctic charrSalvelinus alpinus (L.) of Lake Geneva (Switzerland) we show the need for a curvilinear model, on both statistical and biological grounds. From several 2-parameters models, the log-linear relationship appears to provide the best fit. A 3-parameters, Bertalanffy model did not improve the fit. We show moreover that using the proportional model would lead to important misinterpretations of the data.
Resumo:
We propose a method for brain atlas deformation in the presence of large space-occupying tumors, based on an a priori model of lesion growth that assumes radial expansion of the lesion from its starting point. Our approach involves three steps. First, an affine registration brings the atlas and the patient into global correspondence. Then, the seeding of a synthetic tumor into the brain atlas provides a template for the lesion. The last step is the deformation of the seeded atlas, combining a method derived from optical flow principles and a model of lesion growth. Results show that a good registration is performed and that the method can be applied to automatic segmentation of structures and substructures in brains with gross deformation, with important medical applications in neurosurgery, radiosurgery, and radiotherapy.
Resumo:
In recent years there has been an explosive growth in the development of adaptive and data driven methods. One of the efficient and data-driven approaches is based on statistical learning theory (Vapnik 1998). The theory is based on Structural Risk Minimisation (SRM) principle and has a solid statistical background. When applying SRM we are trying not only to reduce training error ? to fit the available data with a model, but also to reduce the complexity of the model and to reduce generalisation error. Many nonlinear learning procedures recently developed in neural networks and statistics can be understood and interpreted in terms of the structural risk minimisation inductive principle. A recent methodology based on SRM is called Support Vector Machines (SVM). At present SLT is still under intensive development and SVM find new areas of application (www.kernel-machines.org). SVM develop robust and non linear data models with excellent generalisation abilities that is very important both for monitoring and forecasting. SVM are extremely good when input space is high dimensional and training data set i not big enough to develop corresponding nonlinear model. Moreover, SVM use only support vectors to derive decision boundaries. It opens a way to sampling optimization, estimation of noise in data, quantification of data redundancy etc. Presentation of SVM for spatially distributed data is given in (Kanevski and Maignan 2004).
Resumo:
Analysis of variance is commonly used in morphometry in order to ascertain differences in parameters between several populations. Failure to detect significant differences between populations (type II error) may be due to suboptimal sampling and lead to erroneous conclusions; the concept of statistical power allows one to avoid such failures by means of an adequate sampling. Several examples are given in the morphometry of the nervous system, showing the use of the power of a hierarchical analysis of variance test for the choice of appropriate sample and subsample sizes. In the first case chosen, neuronal densities in the human visual cortex, we find the number of observations to be of little effect. For dendritic spine densities in the visual cortex of mice and humans, the effect is somewhat larger. A substantial effect is shown in our last example, dendritic segmental lengths in monkey lateral geniculate nucleus. It is in the nature of the hierarchical model that sample size is always more important than subsample size. The relative weight to be attributed to subsample size thus depends on the relative magnitude of the between observations variance compared to the between individuals variance.
Resumo:
The predictive potential of six selected factors was assessed in 72 patients with primary myelodysplastic syndrome using univariate and multivariate logistic regression analysis of survival at 18 months. Factors were age (above median of 69 years), dysplastic features in the three myeloid bone marrow cell lineages, presence of chromosome defects, all metaphases abnormal, double or complex chromosome defects (C23), and a Bournemouth score of 2, 3, or 4 (B234). In the multivariate approach, B234 and C23 proved to be significantly associated with a reduction in the survival probability. The similarity of the regression coefficients associated with these two factors means that they have about the same weight. Consequently, the model was simplified by counting the number of factors (0, 1, or 2) present in each patient, thus generating a scoring system called the Lausanne-Bournemouth score (LB score). The LB score combines the well-recognized and easy-to-use Bournemouth score (B score) with the chromosome defect complexity, C23 constituting an additional indicator of patient outcome. The predicted risk of death within 18 months calculated from the model is as follows: 7.1% (confidence interval: 1.7-24.8) for patients with an LB score of 0, 60.1% (44.7-73.8) for an LB score of 1, and 96.8% (84.5-99.4) for an LB score of 2. The scoring system presented here has several interesting features. The LB score may improve the predictive value of the B score, as it is able to recognize two prognostic groups in the intermediate risk category of patients with B scores of 2 or 3. It has also the ability to identify two distinct prognostic subclasses among RAEB and possibly CMML patients. In addition to its above-described usefulness in the prognostic evaluation, the LB score may bring new insights into the understanding of evolution patterns in MDS. We used the combination of the B score and chromosome complexity to define four classes which may be considered four possible states of myelodysplasia and which describe two distinct evolutional pathways.
Resumo:
BACKGROUND: Qualitative frameworks, especially those based on the logical discrete formalism, are increasingly used to model regulatory and signalling networks. A major advantage of these frameworks is that they do not require precise quantitative data, and that they are well-suited for studies of large networks. While numerous groups have developed specific computational tools that provide original methods to analyse qualitative models, a standard format to exchange qualitative models has been missing. RESULTS: We present the Systems Biology Markup Language (SBML) Qualitative Models Package ("qual"), an extension of the SBML Level 3 standard designed for computer representation of qualitative models of biological networks. We demonstrate the interoperability of models via SBML qual through the analysis of a specific signalling network by three independent software tools. Furthermore, the collective effort to define the SBML qual format paved the way for the development of LogicalModel, an open-source model library, which will facilitate the adoption of the format as well as the collaborative development of algorithms to analyse qualitative models. CONCLUSIONS: SBML qual allows the exchange of qualitative models among a number of complementary software tools. SBML qual has the potential to promote collaborative work on the development of novel computational approaches, as well as on the specification and the analysis of comprehensive qualitative models of regulatory and signalling networks.