939 results for Statistical Language Model
Abstract:
The purpose of this study is to develop statistical methodology to facilitate indirect estimation of the concentrations of antiretroviral drugs and viral loads in the prostate gland and the seminal vesicle. Differences in antiretroviral drug concentrations across these organs may lead to suboptimal concentrations in one gland compared to the other. Suboptimal levels of antiretroviral drugs will fail to fully suppress the virus in that gland, create a source of sexually transmissible virus, and increase the chance of selecting for drug-resistant virus. This information may be useful in selecting an antiretroviral drug regimen that achieves optimal concentrations in most glands of the male genital tract. Using fractionally collected semen ejaculates, Lundquist (1949) measured levels of surrogate markers in each fraction that are uniquely produced by specific male accessory glands. To determine the original glandular concentrations of the surrogate markers, Lundquist solved a series of simultaneous linear equations. This method has several limitations: it does not yield a unique solution, it does not address measurement error, and it disregards inter-subject variability in the parameters. To cope with these limitations, we developed a mechanistic latent variable model based on the physiology of the male genital tract and the surrogate markers. We employ a Bayesian approach and perform a sensitivity analysis with regard to the distributional assumptions on the random effects and priors. The model and Bayesian approach are validated on experimental data in which the concentration of a drug should be (biologically) differentially distributed between the two glands. In this example, the Bayesian model-based conclusions are found to be robust to model specification, and this hierarchical approach leads to more scientifically valid conclusions than the original methodology. In particular, unlike existing methods, the proposed model-based approach was not affected by a common form of outliers.
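As a rough illustration of the kind of simultaneous linear system Lundquist's method solves, the sketch below deconvolves two hypothetical semen fractions into glandular concentrations; all mixing proportions and marker values are invented for illustration, and the paper's hierarchical Bayesian model extends this with measurement error and random effects.

```python
import numpy as np

# Hypothetical Lundquist-style deconvolution: each semen fraction is a
# mixture of prostate (P) and seminal-vesicle (SV) fluid.
# Rows: fractions; columns: volume proportion contributed by each gland.
mix = np.array([
    [0.8, 0.2],   # fraction 1: mostly prostatic fluid (invented values)
    [0.3, 0.7],   # fraction 2: mostly seminal-vesicle fluid
])

# Observed marker concentration in each fraction (hypothetical values).
observed = np.array([120.0, 45.0])

# Solve mix @ glandular = observed for the original glandular concentrations.
glandular = np.linalg.solve(mix, observed)
print(glandular)  # estimated concentrations in prostate and seminal vesicle
```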
Abstract:
Knowledge of the time interval from death (post-mortem interval, PMI) has an enormous legal, criminological and psychological impact. Aiming to find an objective method for the determination of PMIs in forensic medicine, 1H-MR spectroscopy (1H-MRS) was used in a sheep head model to follow changes in brain metabolite concentrations after death. Following the characterization of newly observed metabolites (Ith et al., Magn. Reson. Med. 2002; 5: 915-920), the full set of acquired spectra was analyzed statistically to provide a quantitative estimation of PMIs with their respective confidence limits. In a first step, analytical mathematical functions are proposed to describe the time courses of 10 metabolites in the decomposing brain up to 3 weeks post-mortem. Subsequently, the inverted functions are used to predict PMIs from the measured metabolite concentrations. Individual PMIs calculated from five different metabolites are then pooled, weighted by their inverse variances. The predicted PMIs from all individual examinations in the sheep model are compared with the known true times. In addition, four human cases with forensically estimated PMIs are compared with predictions based on single in situ MRS measurements. Interpretation of the individual sheep examinations gave a good correlation up to 250 h post-mortem, demonstrating that the predicted PMIs are consistent with the data used to generate the model. Comparison of the estimated PMIs with the forensically determined PMIs in the four human cases shows an adequate correlation. Current PMI estimations based on forensic methods typically suffer from uncertainties on the order of days to weeks, without mathematically defined confidence information. In contrast, a single 1H-MRS measurement of brain tissue in situ yields PMIs with defined and favorable confidence intervals in the range of hours, thus offering a quantitative and objective method for the determination of PMIs.
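The inverse-variance pooling step described above is standard; a minimal sketch (with invented PMI estimates and variances) might look like this:

```python
import numpy as np

def pool_inverse_variance(estimates, variances):
    """Combine independent estimates weighted by their inverse variances."""
    w = 1.0 / np.asarray(variances)
    pooled = np.sum(w * np.asarray(estimates)) / np.sum(w)
    pooled_var = 1.0 / np.sum(w)   # variance of the pooled estimate
    return pooled, pooled_var

# Hypothetical PMIs (hours) predicted from five metabolites, with variances.
pmis = [60.0, 72.0, 66.0, 58.0, 70.0]
variances = [16.0, 25.0, 9.0, 36.0, 20.0]
pmi, var = pool_inverse_variance(pmis, variances)
print(f"pooled PMI: {pmi:.1f} h, 95% CI half-width ~ {1.96 * var ** 0.5:.1f} h")
```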
Abstract:
Constructing a 3D surface model from sparse-point data is a nontrivial task. Here, we report an accurate and robust approach for reconstructing a surface model of the proximal femur from sparse-point data and a dense-point distribution model (DPDM). The problem is formulated as a three-stage optimal estimation process. The first stage, affine registration, iteratively estimates a scale and a rigid transformation between the mean surface model of the DPDM and the sparse input points. The estimation results of the first stage are used to establish point correspondences for the second stage, statistical instantiation, which stably instantiates a surface model from the DPDM using a statistical approach. This surface model is then fed to the third stage, kernel-based deformation, which further refines the surface model. Outliers are handled by consistently employing the least trimmed squares (LTS) approach with a roughly estimated outlier rate in all three stages. If an optimal value of the outlier rate is preferred, we propose a hypothesis testing procedure to estimate it automatically. We present here our validations using four experiments: (1) a leave-one-out experiment; (2) an experiment evaluating the present approach for handling pathology; (3) an experiment evaluating the present approach for handling outliers; and (4) an experiment reconstructing surface models of seven dry cadaver femurs using clinically relevant data, both without and with added noise. Our validation results demonstrate the robust performance of the present approach in handling outliers, pathology, and noise. An average 95-percentile error of 1.7-2.3 mm was found when the present approach was used to reconstruct surface models of the cadaver femurs from sparse-point data with added noise.
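The least trimmed squares idea used throughout the three stages can be sketched generically; the following is a minimal LTS line-fitting example with an assumed outlier rate, not the authors' implementation:

```python
import numpy as np

def lts_fit(X, y, outlier_rate=0.2, n_iter=20, seed=0):
    """Least trimmed squares sketch: iteratively refit on the h points
    with the smallest squared residuals (so-called concentration steps)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    h = int(np.ceil((1.0 - outlier_rate) * n))   # points kept per step
    subset = rng.choice(n, size=h, replace=False)
    for _ in range(n_iter):
        beta, *_ = np.linalg.lstsq(X[subset], y[subset], rcond=None)
        residuals = (y - X @ beta) ** 2
        new_subset = np.argsort(residuals)[:h]   # keep h best-fitting points
        if set(new_subset) == set(subset):
            break
        subset = new_subset
    return beta

# Hypothetical 1-D example with gross outliers injected.
x = np.linspace(0, 10, 50)
X = np.column_stack([x, np.ones_like(x)])
y = 2.0 * x + 1.0 + np.random.default_rng(1).normal(0, 0.1, 50)
y[:5] += 30.0                                    # five gross outliers
print(lts_fit(X, y))                             # slope/intercept near (2, 1)
```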
Abstract:
An appropriate model of recent human evolution is not only important for understanding our own history; it is also necessary to disentangle the effects of demography and selection on genome diversity. Although most genetic data support the view that our species originated recently in Africa, it is still unclear whether it completely replaced former members of the Homo genus, or whether some interbreeding occurred during its range expansion. Several scenarios of modern human evolution have been proposed on the basis of molecular and paleontological data, but their likelihood had never been statistically assessed. Using DNA data from 50 nuclear loci sequenced in African, Asian and Native American samples, we show here by extensive simulations that a simple African replacement model with exponential growth has a higher probability (78%) than alternative multiregional evolution or assimilation scenarios. A Bayesian analysis of the data under this best-supported model points to an origin of our species approximately 141 thousand years ago (Kya), an exit out of Africa approximately 51 Kya, and a recent colonization of the Americas approximately 10.5 Kya. We also find that the African replacement model explains not only the shallow ancestry of mtDNA and Y-chromosomes but also the occurrence of deep lineages at some autosomal loci, which had formerly been interpreted as a sign of interbreeding with Homo erectus.
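Simulation-based comparison of demographic scenarios is commonly done with approximate Bayesian computation (ABC), where the acceptance rate of each model approximates its posterior probability. The sketch below is schematic only: the simulators, summary statistics, and tolerance are placeholders, not the study's coalescent machinery.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder simulators for two competing scenarios; in a real study these
# would be coalescent simulations of many loci under each demographic model.
def simulate_replacement():
    return rng.normal(0.0, 1.0, size=5)        # hypothetical summary statistics

def simulate_multiregional():
    return rng.normal(0.5, 1.2, size=5)

observed = np.array([0.1, -0.2, 0.0, 0.3, -0.1])   # hypothetical observed stats
tolerance = 1.5
counts = {"replacement": 0, "multiregional": 0}

# Rejection-sampling ABC: with equal prior weight on the models, the
# acceptance proportion of each model approximates its posterior probability.
for _ in range(100_000):
    model = rng.choice(["replacement", "multiregional"])
    sim = simulate_replacement() if model == "replacement" else simulate_multiregional()
    if np.linalg.norm(sim - observed) < tolerance:
        counts[model] += 1

total = sum(counts.values())
print({m: round(c / total, 2) for m, c in counts.items()})
```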
Abstract:
Turbulence affects traditional free space optical communication by causing speckle to appear in the received beam profile. This occurs due to changes in the refractive index of the atmosphere caused by fluctuations in temperature and pressure, resulting in an inhomogeneous medium. The Gaussian-Schell model of partial coherence has been suggested as a means of mitigating these atmospheric inhomogeneities on the transmission side. This dissertation analyzed the Gaussian-Schell model of partial coherence in three steps: it verified the Gaussian-Schell model in the far field, investigated the number of independent phase control screens necessary to approach the ideal Gaussian-Schell model, and showed experimentally that the Gaussian-Schell model of partial coherence is achievable in the far field using a liquid crystal spatial light modulator. A method for optimizing the statistical properties of the Gaussian-Schell model was developed to maximize the coherence of the field while ensuring that it does not exhibit the same statistics as a fully coherent source. Finally, a technique was developed to estimate the minimum spatial resolution a spatial light modulator needs in order to effectively propagate the Gaussian-Schell model through a range of atmospheric turbulence strengths. This work showed that regardless of turbulence strength or receiver aperture, transmitting the Gaussian-Schell model of partial coherence instead of a fully coherent source will yield a reduction in the intensity fluctuations of the received field. By measuring the variance of the intensity fluctuations and the received mean, it is shown through the scintillation index that the Gaussian-Schell model of partial coherence is a simple and straightforward alternative to traditional adaptive optics for mitigating atmospheric turbulence in free space optical communications.
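The scintillation index mentioned above is the normalized variance of the received intensity, sigma_I^2 = <I^2>/<I>^2 - 1. A small sketch with synthetic intensity records (the gamma-distributed fading below is an assumption for illustration, not the dissertation's data):

```python
import numpy as np

def scintillation_index(intensity_samples):
    """Scintillation index: sigma_I^2 = <I^2> / <I>^2 - 1."""
    i = np.asarray(intensity_samples, dtype=float)
    return np.mean(i ** 2) / np.mean(i) ** 2 - 1.0

# Hypothetical intensity records: a fully coherent beam fades more strongly
# than a partially coherent one (gamma shape k gives index 1/k).
rng = np.random.default_rng(0)
coherent = rng.gamma(shape=2.0, scale=1.0, size=10_000)    # stronger fading
partial = rng.gamma(shape=8.0, scale=0.25, size=10_000)    # weaker fading
print(scintillation_index(coherent), scintillation_index(partial))
```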
Abstract:
The developmental processes and functions of an organism are controlled by its genes and the proteins derived from those genes. The identification of key genes and the reconstruction of gene networks can provide a model that helps us understand the regulatory mechanisms behind the initiation and progression of biological processes or functional abnormalities (e.g. diseases) in living organisms. In this dissertation, I have developed statistical methods to identify the genes and transcription factors (TFs) involved in biological processes, constructed their regulatory networks, and evaluated some existing association methods to find robust methods for coexpression analyses. Two kinds of data sets were used for this work: genotype data and gene expression microarray data. On the basis of these data sets, the dissertation has two major parts, together forming six chapters. The first part deals with developing association methods for rare variants using genotype data (chapters 4 and 5). The second part deals with developing and/or evaluating statistical methods to identify genes and TFs involved in biological processes, and with constructing their regulatory networks using gene expression data (chapters 2, 3, and 6). For the first part, I developed two methods to find the groupwise association of rare variants with given diseases or traits. The first method is based on kernel machine learning and can be applied to both quantitative and qualitative traits. Simulation results showed that the proposed method has improved power over the existing weighted sum (WS) method in most settings. The second method uses multiple phenotypes to select a few top significant genes. It then finds the association of each gene with each phenotype while controlling for population stratification by adjusting the data for ancestry using principal components. This method was applied to GAW 17 data and found several disease risk genes. For the second part, I worked on three problems. The first problem involved the evaluation of eight gene association methods. A comprehensive comparison of these methods, with further analysis, clearly demonstrates their distinct and common performance characteristics. For the second problem, an algorithm named the bottom-up graphical Gaussian model was developed to identify the TFs that regulate pathway genes and to reconstruct their hierarchical regulatory networks. This algorithm produced highly significant results, and this is the first report of such hierarchical networks for these pathways. The third problem dealt with developing another algorithm, called the top-down graphical Gaussian model, that identifies the network governed by a specific TF. The network produced by the algorithm proved to be of very high accuracy.
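At the core of a graphical Gaussian model is the precision (inverse covariance) matrix, whose off-diagonal entries yield partial correlations between genes after conditioning on all the others. A minimal sketch on synthetic expression data (not the dissertation's algorithms, which add directionality and hierarchy on top of this):

```python
import numpy as np

def partial_correlations(data):
    """Graphical-Gaussian-model view of a samples x genes matrix:
    partial correlations come from the precision (inverse covariance) matrix,
    rho_ij = -p_ij / sqrt(p_ii * p_jj)."""
    precision = np.linalg.pinv(np.cov(data, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Hypothetical expression matrix: 100 samples x 4 genes, where gene 1
# drives genes 2 and 3 and gene 4 is independent.
rng = np.random.default_rng(0)
g1 = rng.normal(size=100)
data = np.column_stack([g1,
                        g1 + rng.normal(scale=0.5, size=100),
                        g1 + rng.normal(scale=0.5, size=100),
                        rng.normal(size=100)])
print(np.round(partial_correlations(data), 2))
```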
Abstract:
Fossil fuel consumption in the United States (U.S.) transportation sector has caused a range of societal issues, including health-related air pollution, climate change, dependence on imported oil, and other oil-related national security concerns. Biofuel production from various lignocellulosic biomass types, such as wood, forest residues, and agricultural residues, has the potential to replace a substantial portion of total fossil fuel consumption. This research focuses on locating biofuel facilities and designing the biofuel supply chain to minimize overall cost. For this purpose, an integrated methodology was proposed that combines GIS technology with simulation and optimization modeling. As a precursor to simulation and optimization modeling, the GIS-based methodology was used to preselect potential facility locations for biofuel production from forest biomass; these candidate sites then served as inputs to the models. Candidate locations were selected based on a set of evaluation criteria, including county boundaries, the railroad transportation network, the state/federal road transportation network, water body (rivers, lakes, etc.) dispersion, city and village dispersion, population census data, biomass production, and no co-location with co-fired power plants. The simulation and optimization models were built around key supply activities, including biomass harvesting/forwarding, transportation, and storage. On-site storage was built to serve the spring breakup period, when road restrictions were in place and truck transportation on certain roads was limited. Both models were evaluated using multiple performance indicators, including cost (consisting of delivered feedstock cost and inventory holding cost), energy consumption, and GHG emissions. The impacts of energy consumption and GHG emissions were expressed in monetary terms to remain consistent with cost. Compared with the optimization model, the simulation model provides a more dynamic look at a 20-year operation by considering the impacts of building inventory at the biorefinery to address the limited availability of biomass feedstock during the spring breakup period. The number of trucks required per day was estimated, and the inventory level throughout the year was tracked. Through the exchange of information across procedures (harvesting, transportation, and biomass feedstock processing), a smooth flow of biomass from harvesting areas to a biofuel facility was implemented. The optimization model was developed to address issues related to locating multiple biofuel facilities simultaneously. The size of each potential biofuel facility is bounded between a lower limit of 30 MGY and an upper limit of 50 MGY. The optimization model is a static, Mathematical Programming Language (MPL)-based application that allows for sensitivity analysis by changing inputs to evaluate different scenarios. It was found that annual biofuel demand and biomass availability impact the optimal biofuel facility locations and sizes.
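A facility-location problem of the kind described is naturally posed as a mixed-integer program. The sketch below, using the PuLP library, is a toy instance with invented sites, supplies, and costs; only the 30-50 MGY capacity window comes from the text, converted here to feedstock tons with an assumed yield:

```python
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary, value

# Hypothetical data: 3 candidate facility sites, 4 biomass supply areas.
sites, areas = ["S1", "S2", "S3"], ["A1", "A2", "A3", "A4"]
fixed_cost = {"S1": 9e6, "S2": 8e6, "S3": 7e6}          # $/year, invented
ship_cost = {("A1", "S1"): 12, ("A1", "S2"): 15, ("A1", "S3"): 20,
             ("A2", "S1"): 14, ("A2", "S2"): 11, ("A2", "S3"): 18,
             ("A3", "S1"): 22, ("A3", "S2"): 16, ("A3", "S3"): 10,
             ("A4", "S1"): 19, ("A4", "S2"): 13, ("A4", "S3"): 12}  # $/ton
supply = {"A1": 200_000, "A2": 150_000, "A3": 180_000, "A4": 120_000}  # tons
tons_per_mgy = 12_000        # assumed dry tons of feedstock per MGY of output

prob = LpProblem("biofuel_facility_location", LpMinimize)
build = LpVariable.dicts("build", sites, cat=LpBinary)
flow = LpVariable.dicts("flow", [(a, s) for a in areas for s in sites], lowBound=0)

# Objective: annualized facility cost plus feedstock transportation cost.
prob += lpSum(fixed_cost[s] * build[s] for s in sites) + \
        lpSum(ship_cost[a, s] * flow[a, s] for a in areas for s in sites)

for a in areas:              # cannot ship more than each area supplies
    prob += lpSum(flow[a, s] for s in sites) <= supply[a]

for s in sites:              # 30-50 MGY capacity window if the site is built
    prob += lpSum(flow[a, s] for a in areas) >= 30 * tons_per_mgy * build[s]
    prob += lpSum(flow[a, s] for a in areas) <= 50 * tons_per_mgy * build[s]

# Meet a hypothetical annual feedstock demand across all facilities.
prob += lpSum(flow[a, s] for a in areas for s in sites) >= 480_000

prob.solve()
print([s for s in sites if value(build[s]) > 0.5])   # sites chosen to build
```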
Abstract:
This paper presents a system for 3-D reconstruction of a patient-specific surface model from calibrated X-ray images. Our system requires two X-ray images of a patient, one acquired from the anterior-posterior direction and the other from the axial direction. A custom-designed cage is utilized in our system to calibrate both images. Starting from bone contours that are interactively identified in the X-ray images, our system constructs a patient-specific surface model of the proximal femur based on a statistical-model-based 2D/3D reconstruction algorithm. In this paper, we present the design and validation of the system on 25 bones. An average reconstruction error of 0.95 mm was observed.
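Statistical 2D/3D reconstruction of this kind typically instantiates a surface from a point distribution model as the mean shape plus a weighted sum of principal modes, x = x_mean + P b. A generic sketch with invented dimensions and eigenvalues (not the paper's model):

```python
import numpy as np

def instantiate_shape(mean_shape, modes, b):
    """Point-distribution-model instantiation: x = x_mean + P @ b,
    where P holds principal modes of variation and b the shape coefficients."""
    return mean_shape + modes @ b

# Hypothetical model: 100 surface points (300 coordinates), 5 modes.
rng = np.random.default_rng(0)
mean_shape = rng.normal(size=300)
modes = rng.normal(size=(300, 5))
eigenvalues = np.array([9.0, 4.0, 2.0, 1.0, 0.5])

# Coefficients are typically constrained to +-3 standard deviations per mode.
sd = np.sqrt(eigenvalues)
b = np.clip(rng.normal(scale=sd), -3 * sd, 3 * sd)
surface = instantiate_shape(mean_shape, modes, b).reshape(-1, 3)
print(surface.shape)  # (100, 3) reconstructed vertex coordinates
```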
Abstract:
PURPOSE: To compare dynamic contrast material-enhanced magnetic resonance (MR) imaging and diffusion-weighted MR imaging for noninvasive evaluation of early and late effects of a vascular targeting agent in a rat tumor model. MATERIALS AND METHODS: The study protocol was approved by the local ethics committee for animal care and use. Thirteen rats with one rhabdomyosarcoma in each flank (26 tumors) underwent dynamic contrast-enhanced imaging and diffusion-weighted echo-planar imaging in a 1.5-T MR unit before intraperitoneal injection of combretastatin A4 phosphate and at early (1 and 6 hours) and later (2 and 9 days) follow-up examinations after the injection. Histopathologic examination was performed at each time point. The apparent diffusion coefficient (ADC) of each tumor was calculated separately on the basis of diffusion-weighted images obtained with low b gradient values (ADC_low; b = 0, 50, and 100 sec/mm^2) and high b gradient values (ADC_high; b = 500, 750, and 1000 sec/mm^2). The difference between ADC_low and ADC_high was used as a surrogate measure of tissue perfusion (ADC_perf = ADC_low - ADC_high). From the dynamic contrast-enhanced MR images, the volume transfer constant k and the initial slope of the contrast enhancement-time curve were calculated. For statistical analyses, a paired two-tailed Student t test and linear regression analysis were used. RESULTS: Early after administration of combretastatin, all perfusion-related parameters (k, initial slope, and ADC_perf) decreased significantly (P < .001); at 9 days after combretastatin administration, they increased significantly (P < .001). Changes in ADC_perf were correlated with changes in k (R^2 = 0.46, P < .001) and the initial slope (R^2 = 0.67, P < .001). CONCLUSION: Both dynamic contrast-enhanced MR imaging and diffusion-weighted MR imaging allow monitoring of perfusion changes induced by vascular targeting agents in tumors. Diffusion-weighted imaging provides additional information about intratumoral cell viability versus necrosis after administration of combretastatin.
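ADC values like these come from a mono-exponential fit of signal decay against b value, S(b) = S0 * exp(-b * ADC). A small sketch with invented signal intensities for the two b-value ranges used above:

```python
import numpy as np

def fit_adc(b_values, signals):
    """Estimate ADC from the mono-exponential model S(b) = S0 * exp(-b * ADC)
    via a log-linear least-squares fit."""
    slope, intercept = np.polyfit(np.asarray(b_values, float), np.log(signals), 1)
    return -slope  # ADC in mm^2/sec when b is given in sec/mm^2

# Hypothetical signals for the two b-value ranges from the study protocol.
b_low, b_high = [0, 50, 100], [500, 750, 1000]
s_low = [1000.0, 930.0, 870.0]     # perfusion-sensitive low-b range
s_high = [520.0, 410.0, 330.0]     # diffusion-weighted high-b range
adc_low, adc_high = fit_adc(b_low, s_low), fit_adc(b_high, s_high)
print(adc_low, adc_high, adc_low - adc_high)   # ADC_perf = ADC_low - ADC_high
```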
Abstract:
Transplantation of fetal dopaminergic (DA) neurons offers an experimental therapy for Parkinson's disease (PD). The low availability and the poor survival and integration of transplanted cells in the host brain are major obstacles in this approach. Glial cell line-derived neurotrophic factor (GDNF) is a potent neurotrophic factor with growth- and survival-promoting capabilities for developing DA neurons. In the present study, we examined whether pretreatment of ventral mesencephalic (VM) free-floating roller tube (FFRT) cultures with GDNF would improve graft survival and function. For that purpose, organotypic cultures of E14 rat VM were grown for 2, 4 or 8 days in the absence (control) or presence of GDNF [10 ng/ml] and transplanted into the striatum of 6-hydroxydopamine-lesioned rats. While all groups of rats showed a significant reduction in d-amphetamine-induced rotations at 6 weeks posttransplantation, significantly improved graft function was observed only in the days in vitro (DIV) 4 GDNF-pretreated group compared to the control group. In addition, no statistically significant differences between groups were found in the number of surviving tyrosine hydroxylase-immunoreactive (TH-ir) neurons assessed at 9 weeks posttransplantation. However, a tendency for higher TH-ir fiber outgrowth from the transplants was observed in the GDNF-pretreated groups compared to the corresponding controls. Furthermore, GDNF pretreatment showed a tendency toward a higher number of GIRK2-positive neurons in the grafts. In sum, our findings demonstrate that GDNF pretreatment was not disadvantageous for transplants of embryonic rat VM prepared with the FFRT culture technique, but only marginally improved graft survival and function.
Abstract:
The aim of this contribution is to analyze the application of empirical tests in German-language sport psychology. The results of comparable analyses, for example in psychology, show that differences exist between the requirements laid down in testing concepts and empirical practice; these differences have not yet been described or evaluated for sport psychology. The 1994-2007 volumes of the Zeitschrift für Sportpsychologie (formerly psychologie und sport) were examined to determine whether research questions were formulated, which type of sample was chosen, which testing concept was applied, which significance level was used, and whether statistical problems were discussed. 83 articles were categorized according to these aspects by two independent raters. The result is that sport psychology research predominantly applies a mixture of Fisher's significance testing and Neyman-Pearson hypothesis testing, the so-called "hybrid model" or "null ritual". Reporting of statistical power is rarely observed. A temporal analysis of the contributions shows that the use of effect sizes in particular has increased in recent years. Finally, approaches for improving and standardizing the application of empirical tests are proposed and discussed.
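The power reporting that the review finds lacking is straightforward to compute; a sketch using statsmodels (the effect size, alpha, and power values are illustrative):

```python
from statsmodels.stats.power import TTestIndPower

# A Neyman-Pearson-style design: fix alpha, the minimum effect size of
# interest, and the desired power, then solve for the required sample size.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5,   # medium effect (Cohen's d)
                                   alpha=0.05,
                                   power=0.80,
                                   alternative="two-sided")
print(round(n_per_group))  # roughly 64 participants per group

# Conversely: the power actually achieved with only 25 participants per group.
achieved = analysis.solve_power(effect_size=0.5, nobs1=25, alpha=0.05)
print(round(achieved, 2))  # about 0.41 -- clearly underpowered
```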
Abstract:
Software must be constantly adapted to changing requirements. The time scale, abstraction level and granularity of adaptations may vary from short-term, fine-grained adaptation to long-term, coarse-grained evolution. Fine-grained, dynamic and context-dependent adaptations can be particularly difficult to realize in long-lived, large-scale software systems. We argue that, in order to effectively and efficiently deploy such changes, adaptive applications must be built on an infrastructure that is not just model-driven, but is both model-centric and context-aware. Specifically, this means that high-level, causally-connected models of the application and the software infrastructure itself should be available at run-time, and that changes may need to be scoped to the run-time execution context. We first review the dimensions of software adaptation and evolution, and then we show how model-centric design can address the adaptation needs of a variety of applications that span these dimensions. We demonstrate through concrete examples how model-centric and context-aware designs work at the level of application interface, programming language and runtime. We then propose a research agenda for a model-centric development environment that supports dynamic software adaptation and evolution.
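As one minimal illustration of context-scoped, run-time adaptation (a Python sketch with invented class and method names, not the authors' infrastructure):

```python
from contextlib import contextmanager

class Renderer:
    def draw(self):
        return "standard rendering"

@contextmanager
def adapted(cls, name, replacement):
    """Scope a behavioral adaptation to a run-time execution context:
    the method is swapped in on entry and restored on exit."""
    original = getattr(cls, name)
    setattr(cls, name, replacement)
    try:
        yield
    finally:
        setattr(cls, name, original)

r = Renderer()
print(r.draw())                        # standard rendering
with adapted(Renderer, "draw", lambda self: "low-bandwidth rendering"):
    print(r.draw())                    # adaptation active only in this context
print(r.draw())                        # standard rendering restored
```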
Abstract:
Object-oriented modelling languages such as EMOF are often used to specify domain-specific meta-models. However, these modelling languages lack the ability to describe behavior or operational semantics. Several approaches have used a subset of Java mixed with OCL as executable meta-languages. In this experience report we show how we use Smalltalk as an executable meta-language in the context of the Moose reengineering environment. We present how we implemented EMOF and its behavioral aspects. Over the last decade we validated this approach by incrementally building a meta-described reengineering environment. Such an approach bridges the gap between a code-oriented view and a meta-model-driven one. It avoids the creation of yet another language and reuses the infrastructure and run-time of the underlying implementation language. It offers a uniform way of letting developers focus on their tasks while at the same time allowing them to meta-describe their domain model. The advantage of our approach is that developers use the same tools and environment they use for their regular tasks. Still, the approach is not Smalltalk-specific but can be applied to any language offering an introspective API, such as Ruby, Python, CLOS, Java and C#.
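Since the authors note the approach carries over to any language with an introspective API, here is a rough Python analogue of meta-describing a domain class using the language's own reflection (the class and helper names are invented for illustration):

```python
import inspect

class Author:
    """A tiny domain class to be meta-described."""
    def __init__(self, name: str, articles: int = 0):
        self.name = name
        self.articles = articles

    def publish(self):
        self.articles += 1

def describe(cls):
    """Build a simple meta-description of a class from the language's own
    introspective API, instead of a separately maintained meta-model."""
    methods = [n for n, m in inspect.getmembers(cls, inspect.isfunction)
               if not n.startswith("_")]
    init_params = [p for p in inspect.signature(cls.__init__).parameters
                   if p != "self"]
    return {"class": cls.__name__, "attributes": init_params, "methods": methods}

print(describe(Author))
# {'class': 'Author', 'attributes': ['name', 'articles'], 'methods': ['publish']}
```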
Abstract:
One of the main roles of the Neural Open Markup Language, NeuroML, is to facilitate cooperation in building, simulating, testing and publishing models of channels, neurons and networks of neurons. MorphML, which was developed as a common format for the exchange of neural morphology data, is distributed as part of NeuroML but can be used as a stand-alone application. In this collection of tutorials and workshop summary, we provide an overview of these XML schemas and give examples of their use in downstream applications. We also summarize plans for the further development of XML specifications for modeling channels, channel distributions, and network connectivity.
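To give a flavor of exchanging morphology data as XML, the sketch below parses a simplified, hypothetical morphology fragment; real MorphML/NeuroML documents use namespaced schemas with richer structure, so the element names here are illustrative only:

```python
import xml.etree.ElementTree as ET

# A simplified, hypothetical morphology fragment in the spirit of MorphML.
doc = """
<morphology>
  <segment id="0" name="soma">
    <proximal x="0" y="0" z="0" diameter="20"/>
    <distal x="0" y="20" z="0" diameter="20"/>
  </segment>
  <segment id="1" name="dendrite" parent="0">
    <proximal x="0" y="20" z="0" diameter="3"/>
    <distal x="0" y="120" z="0" diameter="1"/>
  </segment>
</morphology>
"""

root = ET.fromstring(doc)
for seg in root.iter("segment"):
    distal = seg.find("distal")   # distal end point of each cable segment
    print(seg.get("name"), "->", distal.get("x"), distal.get("y"), distal.get("z"))
```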